Abstract

We investigate two problems for a class C of regular word languages. The C-membership problem 
asks for an algorithm to decide whether an input language belongs to C. The C-separation 
problem asks for an algorithm that, given as input two regular languages, decides whether there 
exists a third language in C containing the first language, while being disjoint from the second. 
These problems are considered as means to obtain a deep understanding of the class C. 

It is usual for such classes to be defined by logical formalisms. Logics are often built on top 
of each other, by adding new predicates. A natural construction is to enrich a logic with the 
successor relation. In this paper, we obtain new and simple proofs of two transfer results: we 
show that for suitable logically defined classes, the membership, resp. the separation problem for 
a class enriched with the successor relation reduces to the same problem for the original class. 

Our reductions work both for languages of finite words and infinite words. The proofs are 
mostly self-contained, and only require a basic background on regular languages. This paper 
therefore gives simple proofs of results that were considered as difficult, such as the decidability 
of the membership problem for the levels 1, 3/2, 2 and 5/2 of the dot-depth hierarchy. 

Keywords and phrases Separation Problem, Regular Word Languages, Logics, Decidable
Characterizations, Semidirect Product

1 Introduction 

A central problem in the theory of formal languages is to characterize and understand the 
expressive power of high level specification formalisms. Monadic second order logic (MSO) 
is such a formalism, which is both expressive and robust. For several classes of structures, 
such as words or trees, it has the same expressive power as finite automata and defines the 
class of regular languages. In this paper, we investigate fragments of MSO over words. In 
this context, understanding the expressive power of a fragment is associated to two decision 
problems: the membership problem and the separation problem. 

For a fixed logical fragment $\mathcal{F}$, the $\mathcal{F}$-membership problem asks for a decision
procedure that tests whether some input regular language can be expressed by a formula from
$\mathcal{F}$. To obtain such an algorithm, one has to consider and understand all properties that can
be expressed within $\mathcal{F}$, which requires a deep understanding of the fragment $\mathcal{F}$. On the
other hand, the $\mathcal{F}$-separation problem is more general. It asks for a decision procedure that
tests whether, given two input regular languages, there exists a third one in $\mathcal{F}$ containing the
first language while being disjoint from the second one.

Since regular languages are closed under complement, membership reduces to separation:
a language is in $\mathcal{F}$ if and only if it can be separated from its complement. Usually, the
separation problem is more difficult than the membership problem but also more rewarding
with respect to the knowledge gained on the investigated fragment $\mathcal{F}$.

These two problems have been considered and solved for many natural fragments of
monadic second order logic. Among these, the most prominent one is first-order logic, FO(<),
equipped with a predicate < for the linear ordering. The solution to the membership problem,
known as the McNaughton-Papert-Schützenberger Theorem [24, 12], has been revisited until
recently [7]. The theorem states that a regular language is definable in FO(<) if and only if 
its syntactic semigroup is aperiodic. The syntactic semigroup is a finite algebraic object that 
can be computed from any regular language. Since aperiodicity can be defined as an equation 
that needs to be satisfied by all of its elements, this yields decidability of FO(<)-definability. 
This result now serves as a template, which is commonly followed in this line of research. 
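
As an illustration of this template, the following Python sketch tests aperiodicity on the transition semigroup of a complete deterministic automaton. It assumes (this is not stated in the text) that the automaton is minimal, in which case its transition semigroup is isomorphic to the syntactic semigroup of the language. The function names and the encoding of each letter as a tuple of target states are choices made for this example only.

```python
def compose(s, t):
    """Transformation obtained by applying s first, then t."""
    return tuple(t[q] for q in s)


def transition_semigroup(letters):
    """Close the letter transformations under composition (nonempty words only)."""
    gens = set(letters.values())
    elems = set(gens)
    frontier = list(gens)
    while frontier:
        s = frontier.pop()
        for g in gens:
            t = compose(s, g)
            if t not in elems:
                elems.add(t)
                frontier.append(t)
    return elems


def is_aperiodic(elems):
    """Check that the powers s, s^2, s^3, ... of every element stabilize."""
    for s in elems:
        powers = [s]
        while True:
            nxt = compose(powers[-1], s)
            if nxt in powers:
                # The sequence of powers has entered its cycle; the semigroup
                # is aperiodic only if that cycle has length 1.
                if nxt != powers[-1]:
                    return False
                break
            powers.append(nxt)
    return True


# Automaton for (ab)* with states 0, 1 and sink 2: letter -> target of each state.
AB_STAR = {"a": (1, 2, 2), "b": (2, 0, 2)}
# Automaton for words of even length over {a}.
EVEN_LENGTH = {"a": (1, 0)}

print(is_aperiodic(transition_semigroup(AB_STAR)))
print(is_aperiodic(transition_semigroup(EVEN_LENGTH)))
```

Under these assumptions, the first call reports an aperiodic semigroup and the second a non-aperiodic one, which matches the fact that (ab)* is definable in FO(<) whereas "even length" is not.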

The separation problem has also been successfully solved for first-order logic [9]. Actually, 
the problem was first addressed in a purely algebraic framework, and was later identified as 
equivalent to our separation problem [2]. As for membership, this problem is still revisited
today and a new self-contained and combinatorial proof was obtained in [22].

Motivation. We are interested in natural fragments of FO(<) obtained by restricting either 
the number of variables or the number of quantifier alternations allowed in formulas. Such 
restrictions in general give rise to several variants of the same fragment. Indeed, in most 
cases, the drop in expressive power forbids the use of natural relations that could be defined 
from the linear order in FO(<). The main example considered in this paper is +1, the 
successor relation, together with predicates min and max for the first and last positions 
in a word. This means that one can define two distinct variants of the same fragment 
depending on whether we decide to explicitly add these predicates in the signature or not. 
An example is the fragment $\Sigma_n$, which consists of first-order formulas whose prenex normal
form has at most $(n-1)$ quantifier alternations and starts with an existential block. Since
defining +1 requires an additional quantifier alternation, $\Sigma_n(<,+1,\min,\max)$ has indeed
stronger expressiveness than $\Sigma_n(<)$. The motivation of this paper is to obtain decidability
results for such enriched fragments. 

State of the Art. Even when the weak fragment is known to have decidable membership, 
proving that the enriched one has the same property can be nontrivial. Examples include the 
membership proofs of $\mathcal{B}\Sigma_1(<,+1,\min,\max)$ (Boolean combinations of $\Sigma_1(<,+1,\min,\max)$
formulas) and $\Sigma_2(<,+1)$, which require difficult and intricate combinatorial arguments [10,
8, 11] or a wealth of algebraic machinery [15, 17]. Another issue is that most proofs directly 
deal with the enriched fragment. Given the jungle of such logical fragments, it is desirable to 
avoid such an approach, treating each variant of the same fragment independently. Instead, 
a satisfying approach is to first obtain a solution of the membership and separation problems 
for the less expressive variant and then to lift it to other variants via a generic transfer result. 

This approach was first investigated by Straubing for the membership problem [28]
in an algebraic framework, and later adapted to treat classes not closed under
complement [17]. Transferring the logical problem to this algebraic framework requires
preliminary steps, still specific to the investigated class, to prove that:

1. A language is definable in the fragment if and only if its syntactic semigroup belongs to a 
specific algebraic variety V (e.g., the variety of aperiodic monoids for FO(<)), and 

2. Membership to V is decidable. 

Next, though this is not immediate, for most fragments of FO(<), it has been proved that 

3. When the weaker variant corresponds to a variety V, the variant with successor corresponds 
to the variety V * D, built generically from V. 


T. Place and M. Zeitoun 


Hence, Straubing’s approach was to prove that 
4. the operator $V \mapsto V * D$ preserves decidability.

Unfortunately, this is not true in general [3]. Actually, while decidability is preserved for all 
known logical fragments, there is no generic result that captures them all. In particular, for 
the less expressive fragments, one has to use completely ad hoc proofs. In the separation 
setting, things behave well: it has been shown that decidability of separation is preserved by 
the operation $V \mapsto V * D$ [26]. While interesting when already starting from algebra, this
approach has several downsides: 

■ Dealing with algebra hides the logical intuitions, while our primary goal is to understand
the expressiveness of logics.

■ Going from logic to algebra requires becoming acquainted with new notions and vocabulary,
as well as involved theoretical tools. Proofs are also often nontrivial and require a deep
understanding of complex objects, which may be scattered in the bibliography.

■ Despite step 4, which is generic to some extent, arguments specific to the investigated
class are pushed to steps 1-3, and they are often nontrivial.


Contributions. We give a new proof that decidability of separation can be transferred from 
a weak to an enriched fragment. We present the result in two different forms. 

The first one is non-algebraic: we work directly with the logical fragments, without using
varieties. The transfer result is generic and its proof mostly is: the only specific argument
is an Ehrenfeucht-Fraïssé game that can be adapted to all natural fragments with minimal
difficulty. The benefits of this new proof are that:

1. It is self-contained and much simpler than previous ones. It only relies on two basic
well-known notions: recognizability by semigroups and Ehrenfeucht-Fraïssé games.

2. It works with classes that are not closed under complement, contrary to [26]. This allows
us to capture the $\Sigma$ and $\Pi$ levels in the quantifier alternation hierarchy of first-order logic.

3. Under an additional hypothesis on the logical fragment, which is met for most fragments 
we investigate and easy to check, the decidability result of the separation problem also 
extends to the membership problem. 

4. The proof adapts smoothly to infinite words using the notion of $\omega$-semigroups.

The second form of our result is algebraic and generic. We prove that $V \mapsto V * D$ preserves
the decidability of separation for varieties, hence giving an elementary proof of a result of [26].
Even in this algebraic form, we completely bypass involved constructions or notions, such as
pointlike sets for categories developed in [26], thus making the proof accessible.

As corollaries, since $\mathcal{B}\Sigma_1(<)$ and $\Sigma_2(<)$ both enjoy decidable separation [6, 20, 21], we
obtain that this is also the case for the fragments $\mathcal{B}\Sigma_1(<,+1,\min,\max)$ and $\Sigma_2(<,+1)$,
known as levels 1 and 3/2 of the dot-depth hierarchy. These new results strengthen the
previous ones [10, 8] that showed decidability of membership and were considered difficult.
We actually obtain that separation for $\Sigma_n(<,+1,\min,\max)$ reduces to separation for $\Sigma_n(<)$.
Since we also transfer decidability of the membership problem, and since the fragments
$\mathcal{B}\Sigma_2(<)$ of Boolean combinations of $\Sigma_2(<)$ formulas and $\Sigma_3(<)$ have decidable membership [21],
we deduce that the same holds for $\mathcal{B}\Sigma_2(<,+1)$ and $\Sigma_3(<,+1)$, known as levels 2 and 5/2 of the
dot-depth hierarchy.

Organization of the Paper. In Section 2, we set up the notation and we present the
separation problem and the logics we deal with. In Section 3, we present an overview of 
our main contribution. Section 4 is devoted to our technical tool: languages of well-formed 
words. In Section 5, we use it to prove our transfer result for all fragments from the logical 
perspective. In Section 7, we establish that decidability of the separation problem for the 
variety V entails the same for V * D. In order to instantiate this result for concrete logical 
fragments, thus obtaining an alternate proof of our transfer result, we rely on algebraic 
properties from the bibliography for each fragment and its enrichment: they are presented in 
Section 6.3. This paper is the full version of [23]. 

2 Preliminaries 

In this section, we provide preliminary definitions on regular languages defined by logical 
fragments and on separation. 

Words, Languages. We fix a finite alphabet $A$. Let $A^+$ be the set of all nonempty finite
words and let $A^*$ be the set of all finite words over $A$. If $u, v$ are words, we denote by
$u \cdot v$ or by $uv$ the word obtained by concatenating $u$ and $v$. For convenience, we only
consider, without loss of generality, languages that do not contain the empty word. That is,
a language is a subset of $A^+$. We work with regular languages, that is, languages definable
by finite automata.

Separation. Given three languages $K, L, L'$, we say that $K$ separates $L$ from $L'$ if

$$L \subseteq K \quad\text{and}\quad K \cap L' = \emptyset.$$

If C is a class of languages, we say that $L$ is C-separable from $L'$ if there exists $K \in$ C that
separates $L$ from $L'$. Note that if C is closed under complement, $L$ is C-separable from $L'$
if and only if $L'$ is C-separable from $L$. However, this is not true for a class C not closed
under complement, such as the classes $\Sigma_n(<)$ of the quantifier alternation hierarchy, which
we shall consider.
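
To make the definition concrete, here is a small Python sketch of separation over a finite toy universe. It is only an illustration: the actual problem deals with infinite regular languages and infinite classes, whereas this example restricts everything to words of length at most 2, and the class `TOY_CLASS` is a hypothetical two-element class chosen because it is not closed under complement.

```python
from itertools import product


def words_up_to(alphabet, max_len):
    """All nonempty words over `alphabet` of length at most `max_len`."""
    return {"".join(p) for n in range(1, max_len + 1)
            for p in product(alphabet, repeat=n)}


def separates(K, L, L_prime):
    """K separates L from L_prime: L is included in K and K misses L_prime."""
    return L <= K and K.isdisjoint(L_prime)


def is_separable(candidates, L, L_prime):
    """Brute-force search for a separator among finitely many candidates."""
    return any(separates(K, L, L_prime) for K in candidates)


UNIVERSE = words_up_to("ab", 2)
CONTAINS_A = {w for w in UNIVERSE if "a" in w}
TOY_CLASS = [CONTAINS_A, UNIVERSE]   # not closed under complement

L = {"a", "ab"}
L_PRIME = {"b", "bb"}
print(is_separable(TOY_CLASS, L, L_PRIME))
print(is_separable(TOY_CLASS, L_PRIME, L))
```

In this toy setting the first query succeeds and the second fails, which illustrates why the order of the two input languages matters for classes that are not closed under complement.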

Given a class C, the C-separation problem asks for an algorithm which, given as input two 
regular languages $L, L'$, decides whether $L$ is C-separable from $L'$. The C-membership problem,
which asks whether an input regular language belongs to C, reduces to the C-separation 
problem, as a regular language belongs to C iff it is C-separable from its complement. 

Logics. We investigate several fragments of first-order logic on finite words. We view a finite 
word as a logical structure made of a sequence of positions labeled over A. We work with 
first-order logic FO(<) using a unary predicate $P_a$ for each $a \in A$, which selects positions
labeled with an $a$, as well as binary predicates '=' for equality and '<' for the linear order.
Such a formula defines the regular language of all words that satisfy it. We will freely use 
the name of a logical fragment of FO(<) to denote the class of languages definable in this 
fragment. Observe that FO(<) is powerful enough to express the following logical relations: 

■ First position, $\min(x)$: $\forall y\ \neg(y < x)$.

■ Last position, $\max(x)$: $\forall y\ \neg(x < y)$.

■ Successor, $y = x + 1$: $x < y \wedge \neg(\exists z\ x < z \wedge z < y)$.
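
The three definitions above can be evaluated by brute force on a concrete word. The Python sketch below does so by letting quantifiers range over the positions `0, ..., len(w) - 1`; the function names and the 0-based numbering of positions are conventions of this example, not of the paper.

```python
def is_min(w, x):
    """min(x): for all y, not (y < x)."""
    return all(not (y < x) for y in range(len(w)))


def is_max(w, x):
    """max(x): for all y, not (x < y)."""
    return all(not (x < y) for y in range(len(w)))


def is_successor(w, x, y):
    """y = x + 1: x < y and there is no z with x < z and z < y."""
    return x < y and not any(x < z and z < y for z in range(len(w)))


w = "abca"
positions = range(len(w))
print([x for x in positions if is_min(w, x)])
print([x for x in positions if is_max(w, x)])
print([(x, y) for x in positions for y in positions if is_successor(w, x, y)])
```

Note that `is_successor` uses a third quantified position `z`, which is exactly what is unavailable in a two-variable fragment.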

However, for most fragments of FO(<) this is not the case. For example, in the two-variable
restriction $\mathrm{FO}^2(<)$ of FO(<), it is not possible to express successor, as it requires
quantifying over a third variable. For these fragments $\mathcal{F}$, adding the predicates min, max
and +1 yields a strictly more powerful logic $\mathcal{F}^+$. Our goal is to prove a transfer result
for such fragments: given a fragment, if the separation problem is decidable for the weak
variant $\mathcal{F}$, then it is decidable as well for the strong variant $\mathcal{F}^+$ obtained by enriching $\mathcal{F}$ with
the above relations. The technique is generic, meaning that it is not bound to a particular 
logic. In particular, our transfer result applies to the following well-known logical fragments: 

■ FO(=), the restriction of FO(<) in which the linear order cannot be used, and only
equality between two positions can be tested. The enriched fragment FO(=, +1) (min and
max can be eliminated from the formulas) defines locally threshold testable languages [32].

■ All levels in the quantifier alternation hierarchy of first-order logic. A first-order formula
is $\Sigma_n(<)$ (resp. $\Pi_n(<)$) if its prenex normal form contains at most $(n-1)$ quantifier
alternations and starts with an $\exists$ (resp. a $\forall$) quantifier block. Finally, a $\mathcal{B}\Sigma_n(<)$ formula
is a Boolean combination of $\Sigma_n(<)$ and $\Pi_n(<)$ formulas.

Since for all fragments above $\Sigma_2(<)$, a formula involving min and max can be expressed
without these predicates in the same logic, we shall denote the enriched fragments by
$\Sigma_1(<,+1,\min,\max)$, $\mathcal{B}\Sigma_1(<,+1,\min,\max)$, and then by $\Sigma_2(<,+1)$, $\mathcal{B}\Sigma_2(<,+1)$, ...

■ $\mathrm{FO}^2(<)$, the restriction of FO(<) using only two reusable variables. The corresponding
enriched fragment is $\mathrm{FO}^2(<,+1)$, since min and max can again be eliminated from the
formulas.

Figure 1 summarizes all fragments the technique applies to. 


Weak variant     FO(=)      $\mathrm{FO}^2(<)$      $\Sigma_n(<)$                $\mathcal{B}\Sigma_n(<)$
Strong variant   FO(=,+1)   $\mathrm{FO}^2(<,+1)$   $\Sigma_n(<,+1,\min,\max)$   $\mathcal{B}\Sigma_n(<,+1,\min,\max)$


Figure 1 Logical fragments to which the technique applies. 


3 Overview of the Main Result 

In this short section, we explain our main contribution. We prove the following result. 

► Theorem 1. Let $\mathcal{F}$ and $\mathcal{F}^+$ be respectively the weak and strong variants of one of the logical
fragments in Figure 1. Then $\mathcal{F}^+$-separability can be effectively reduced to $\mathcal{F}$-separability.

We actually establish two versions of this theorem: 

■ The first form, Theorem 4, is obtained by purely logical means. It is not entirely generic,
since one of the directions of the reduction proof relies on Ehrenfeucht-Fraïssé games
adapted to the fragment under consideration. On the other hand, it has the advantage of
having a direct, self-contained and elementary proof, built on a constructive reduction:
from two regular languages, we effectively build two new regular languages, and we exhibit
an $\mathcal{F}^+$ separator for the original languages from an $\mathcal{F}$ separator for the new ones.

■ The second form, Theorem 22, is based on algebraic tools. The transfer result in this
statement is presented on classes of finite ordered monoids or semigroups associated to
the weak and enriched fragments respectively, through Eilenberg's correspondence. It
has the advantage of being completely generic: no hypothesis on the algebraic class is
assumed. Even if this approach requires some vocabulary and machinery from algebra,
its presentation is still much simpler than the previous one [26]. An issue however is that,
in order to apply this theorem to a specific fragment, one has to find beforehand which
algebraic classes correspond to the weak and enriched fragments. In other terms, the
statement indeed isolates a generic transfer property, but it relies on specific
correspondences in order to be instantiated on a given fragment. Fortunately, the correspondences
we need for treating all classes of Figure 1 have already been established. They will be
recalled in Section 6.3.

All logical fragments from Figure 1 have a rich history and have been extensively studied 
in the literature. In particular, the separation problem is known to be decidable for the 
following fragments: FO(=), $\mathrm{FO}^2(<)$, $\Sigma_1(<)$, $\mathcal{B}\Sigma_1(<)$, $\Sigma_2(<)$ [6, 20, 21]. This means
that, from our results, we obtain decidability of separation for FO(=,+1), $\mathrm{FO}^2(<,+1)$,
$\Sigma_1(<,+1,\min,\max)$, $\mathcal{B}\Sigma_1(<,+1,\min,\max)$ and $\Sigma_2(<,+1)$.

Note that for FO(=,+1), $\mathrm{FO}^2(<,+1)$ and $\mathcal{B}\Sigma_1(<,+1,\min,\max)$, the results could
already be obtained as corollaries of algebraic theorems of Steinberg [26] and Almeida [2].
As explained above, an issue with this approach is that the proof of Steinberg's result relies
on deep algebraic arguments and is a priori not tailored to separation: the connection with
separation is made by Almeida [2].

For $\Sigma_1(<,+1,\min,\max)$ and $\Sigma_2(<,+1)$, the result is new, as Steinberg's result does not
apply to classes of languages that are not closed under complement.

4 Tools for the Logical Approach: Semigroups, Well-Formed Words 

In this section, we define the main tools used for the logical approach in this paper. 

■ We first recall the well-known semigroup-based definition of regular languages: a language
is regular if and only if it can be recognized by a finite semigroup.

■ Our second tool, well-formed words, is specific to our problem and plays a key role in our
transfer result. It is presented in Section 4.2.

The tools specific to the algebraic approach are postponed to Section 6. 

4.1 Semigroups and Monoids 

We work with the algebraic representation of regular languages. Here we briefly recall the 
main definitions. We refer the reader to [13] for additional details. 

Semigroups. A semigroup is a set $S$ equipped with an associative product, written $s \cdot t$ or $st$.
A monoid is a semigroup $S$ having a neutral element $1_S$, i.e., such that $s \cdot 1_S = 1_S \cdot s = s$ for
all $s \in S$. If $S$ is a semigroup, then $S^1$ denotes the monoid $S \cup \{1_S\}$ where $1_S \notin S$ is a new
element, acting as neutral element. Note that we add such a new identity even if $S$ is already
a monoid. A semigroup morphism is a mapping $\alpha : S \to T$ from one semigroup to another
which respects the algebraic structure: for all $s, s' \in S$, we have $\alpha(s \cdot s') = \alpha(s) \cdot \alpha(s')$. For a
monoid morphism, we require additionally $S$ and $T$ to be monoids and $\alpha(1_S) = 1_T$.

An element $e \in S$ is idempotent if $e \cdot e = e$. We denote by $E(S)$ the set of idempotents
of $S$. Given a finite semigroup $S$, it is folklore and easy to see that there is an integer $\omega(S)$
(denoted by $\omega$ when $S$ is understood) such that for all $s$ of $S$, $s^\omega$ is idempotent: $s^\omega = s^\omega s^\omega$.
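
The following Python sketch computes the idempotents and such an exponent for a finite semigroup given by its multiplication. It is a naive illustration under stated assumptions: the semigroup is passed as an iterable of elements together with a function `mul`, the returned exponent is the least suitable one (the text only requires that some exponent exists), and the two example semigroups are chosen here for illustration.

```python
def power(mul, s, n):
    """s multiplied by itself n times (n >= 1)."""
    result = s
    for _ in range(n - 1):
        result = mul(result, s)
    return result


def idempotents(elems, mul):
    """Elements e such that e * e == e."""
    return {e for e in elems if mul(e, e) == e}


def idempotent_power(elems, mul):
    """Least n >= 1 such that s^n is idempotent for every element s."""
    elems = list(elems)
    idem = idempotents(elems, mul)
    n = 1
    # Terminates on a finite semigroup, since a suitable exponent exists.
    while not all(power(mul, s, n) in idem for s in elems):
        n += 1
    return n


def mul4(s, t):
    """Integers modulo 4 under multiplication."""
    return (s * t) % 4


def add3(s, t):
    """Integers modulo 3 under addition."""
    return (s + t) % 3


print(idempotents(range(4), mul4), idempotent_power(range(4), mul4))
print(idempotents(range(3), add3), idempotent_power(range(3), add3))
```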

Note that $A^+$ and $A^*$ equipped with concatenation are respectively a semigroup and a
monoid called the free semigroup over $A$ and the free monoid over $A$. Let $L \subseteq A^+$ be a
language and $S$ be a semigroup (resp. a monoid). We say that $L$ is recognized by $S$ if there
exist a morphism $\alpha : A^+ \to S$ (resp. $\alpha : A^* \to S$) and a set $F \subseteq S$ such that $L = \alpha^{-1}(F)$.

Semigroups and Separation. The separation problem takes as input two regular languages
$L, L'$. It is convenient to work with a single object recognizing both of them, rather than
having to deal with two. Let $S, S'$ be semigroups recognizing $L, L'$ together with the
associated morphisms $\alpha, \alpha'$, respectively. Clearly, $L$ and $L'$ are both recognized by $S \times S'$
with the morphism $\alpha \times \alpha' : A^+ \to S \times S'$ mapping $w$ to $(\alpha(w), \alpha'(w))$. From now on, we
work with such a single semigroup recognizing both languages. Replacing $S \times S'$ with its
image under $\alpha \times \alpha'$, one can also assume that this morphism is surjective. To sum up, we
assume from now on, without loss of generality, that $L$ and $L'$ are recognized by a single
surjective morphism.
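
The product construction and the restriction to the image can be sketched as follows in Python. This is an illustration only: morphisms are represented by dictionaries mapping each letter to a semigroup element, semigroups by their multiplication functions, and the two example morphisms (parity of the number of a's, presence of a b) are hypothetical choices for this example.

```python
def product_image(alpha, mul, alpha_prime, mul_prime):
    """Image of the morphism w -> (alpha(w), alpha_prime(w)) on nonempty words.

    Returns the set of reachable pairs and the componentwise multiplication.
    """
    def pair_mul(p, q):
        return (mul(p[0], q[0]), mul_prime(p[1], q[1]))

    gens = {(alpha[c], alpha_prime[c]) for c in alpha}
    elems = set(gens)
    frontier = list(gens)
    while frontier:
        s = frontier.pop()
        for g in gens:
            t = pair_mul(s, g)   # image of a word extended by one letter
            if t not in elems:
                elems.add(t)
                frontier.append(t)
    return elems, pair_mul


def add_mod2(s, t):
    """Integers modulo 2 under addition."""
    return (s + t) % 2


# alpha counts the letters 'a' modulo 2; alpha_prime records whether 'b' occurs
# (values in {0, 1} multiplied with max).
ALPHA = {"a": 1, "b": 0}
ALPHA_PRIME = {"a": 0, "b": 1}

image, pair_mul = product_image(ALPHA, add_mod2, ALPHA_PRIME, max)
print(sorted(image))
```

With this representation, the first language is the preimage of the pairs whose first component is accepting, and the second language is the preimage of the pairs whose second component is accepting, so a single surjective morphism onto `image` recognizes both.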

4.2 Well-Formed Words 

In this section, we define our main tool for this paper. Assume that $\mathcal{F}$ is the weak variant of
one of the logical fragments of Figure 1 and let $\mathcal{F}^+$ be the corresponding enriched variant.
To any semigroup morphism $\alpha : A^+ \to S$ into a finite semigroup $S$, we associate a new
alphabet $A_\alpha$ called the alphabet of well-formed words. The main intuition behind this notion
is that the $\mathcal{F}^+$-separation problem for any two regular languages recognized by $\alpha$ can be
reduced to the $\mathcal{F}$-separation problem for two regular languages over $A_\alpha$.

The alphabet $A_\alpha$, called alphabet of well-formed words of $\alpha$, is defined from $\alpha : A^+ \to S$ by:

$$A_\alpha = (E(S) \times S \times E(S)) \cup (S \times E(S)) \cup (E(S) \times S) \cup S.$$

We will not be interested in all words of $A_\alpha^+$, but only in those that are well-formed. A word
$w \in A_\alpha^+$ is said to be well-formed if one of the following two properties holds:

■ $w$ is a single letter $s \in S$,
■ $w$ has length $\geq 2$ and is of the form

$$(s_0, f_0) \cdot (e_1, s_1, f_1) \cdots (e_n, s_n, f_n) \cdot (e_{n+1}, s_{n+1}) \in (S \times E(S)) \cdot (E(S) \times S \times E(S))^* \cdot (E(S) \times S)$$

with $f_i = e_{i+1}$ for all $0 \leq i \leq n$.
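
The definition of well-formed words can be checked mechanically. The Python sketch below uses one possible encoding, which is not the paper's: letters are represented as tagged tuples so that the four components of the union stay disjoint, and the semigroup is given by a function `mul`. The tags `"s"`, `"se"`, `"esf"`, `"es"` and the example semigroup (integers modulo 4 under multiplication, whose idempotents are 0 and 1) are assumptions made for illustration. The sketch does not verify that the middle components belong to the semigroup.

```python
def is_well_formed(word, mul):
    """Check well-formedness of a list of tagged letters.

    Letters are encoded as ("s", s), ("se", s, f), ("esf", e, s, f)
    or ("es", e, s), where e and f are meant to be idempotents.
    """
    if len(word) == 1:
        return word[0][0] == "s"
    if len(word) < 2:
        return False
    first, middle, last = word[0], word[1:-1], word[-1]
    if first[0] != "se" or last[0] != "es":
        return False
    if any(letter[0] != "esf" for letter in middle):
        return False
    # Right idempotent of each letter and left idempotent of the next letter.
    rights = [first[2]] + [letter[3] for letter in middle]
    lefts = [letter[1] for letter in middle] + [last[1]]
    return rights == lefts and all(mul(e, e) == e for e in rights)


def mul4(s, t):
    """Integers modulo 4 under multiplication."""
    return (s * t) % 4


good = [("se", 3, 1), ("esf", 1, 2, 0), ("es", 0, 3)]
bad = [("se", 3, 1), ("es", 0, 3)]   # idempotents 1 and 0 do not match
print(is_well_formed(good, mul4), is_well_formed(bad, mul4))
```

Since the check only compares each letter with its immediate neighbour, it is a purely local condition, which is in line with the regularity claim that follows.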

► Fact 2. The set of well-formed words of $A_\alpha^+$ is a regular language.

We now define a morphism $\beta : A_\alpha^+ \to S$ as follows. If $s \in S$, we set $\beta(s) = s$, if
$(e, s) \in E(S) \times S$, we set $\beta((e, s)) = es$, if $(s, e) \in S \times E(S)$, we set $\beta((s, e)) = se$ and if
$(e, s, f) \in E(S) \times S \times E(S)$, we set $\beta((e, s, f)) = esf$.
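
The morphism just defined multiplies out the components of each letter and then the images of the letters. A minimal Python sketch follows; the tagged-tuple encoding of letters (`"s"`, `"se"`, `"esf"`, `"es"`) and the example semigroup (integers modulo 4 under multiplication) are assumptions of this example rather than notation from the paper.

```python
from functools import reduce


def beta_letter(letter, mul):
    """Image of a single tagged letter in the semigroup."""
    tag = letter[0]
    if tag == "s":                       # ("s", s)       -> s
        return letter[1]
    if tag in ("es", "se"):              # ("es", e, s)   -> e * s
        return mul(letter[1], letter[2])  # ("se", s, e)   -> s * e
    if tag == "esf":                     # ("esf", e, s, f) -> e * s * f
        return mul(mul(letter[1], letter[2]), letter[3])
    raise ValueError(f"unknown letter {letter!r}")


def beta(word, mul):
    """Image of a nonempty word: product of the images of its letters."""
    return reduce(mul, (beta_letter(letter, mul) for letter in word))


def mul4(s, t):
    """Integers modulo 4 under multiplication."""
    return (s * t) % 4


word = [("se", 3, 1), ("esf", 1, 2, 0), ("es", 0, 3)]
print(beta(word, mul4))
```

Deciding whether a word over the new alphabet belongs to the language associated with some accepting set would combine this computation with a well-formedness check.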

Associated Language of Well-formed Words. To any language $L \subseteq A^+$ that is recognized
by a morphism $\alpha : A^+ \to S$ into a finite semigroup $S$, one associates a language of well-formed
words $\mathbb{L} \subseteq A_\alpha^+$:

$$\mathbb{L} = \{w \in A_\alpha^+ \mid w \text{ is well-formed and } \beta(w) \in \alpha(L)\}.$$

By definition, the language $\mathbb{L} \subseteq A_\alpha^+$ is the intersection of the language of well-formed words
with $\beta^{-1}(\alpha(L))$. Therefore, it is immediate by Fact 2 that it is regular, more precisely:

► Fact 3. Let $L \subseteq A^+$ be a language recognized by a morphism $\alpha$ into a finite semigroup.
Then, the associated language of well-formed words $\mathbb{L} \subseteq A_\alpha^+$ is a regular language that one
can effectively compute from a recognizer of $L$.

5 Logical Approach

In this section, we prove Theorem 1 from a logical perspective. We begin by presenting
our separation theorem, which will entail the membership theorem as a simple consequence.

► Theorem 4. Let $\mathcal{F}$ and $\mathcal{F}^+$ be respectively the weak and strong variants of one of the
logical fragments in Figure 1.

Let $L, L'$ be two languages recognized by a morphism $\alpha : A^+ \to S$ into a finite semigroup $S$.
Let $\mathbb{L}, \mathbb{L}' \subseteq A_\alpha^+$ be the languages of well-formed words associated with $L, L'$, respectively.
Then $L$ is $\mathcal{F}^+$-separable from $L'$ iff $\mathbb{L}$ is $\mathcal{F}$-separable from $\mathbb{L}'$.

Theorem 4 reduces $\mathcal{F}^+$-separation to $\mathcal{F}$-separation. The latter was already known to
be decidable for several weak variants in Figure 1, namely for FO(=) [19], $\mathrm{FO}^2(<)$ [20],
$\Sigma_1(<)$ [6], $\mathcal{B}\Sigma_1(<)$ [6, 20] and $\Sigma_2(<)$ [21]. Hence, we get the following corollary.

► Corollary 5. Let $L, L'$ be regular languages. Then the following problems are decidable:
■ whether $L$ is FO(=,+1)-separable from $L'$.
■ whether $L$ is $\mathrm{FO}^2(<,+1)$-separable from $L'$.
■ whether $L$ is $\Sigma_1(<,+1,\min,\max)$-separable from $L'$.
■ whether $L$ is $\mathcal{B}\Sigma_1(<,+1,\min,\max)$-separable from $L'$.
■ whether $L$ is $\Sigma_2(<,+1)$-separable from $L'$.

Notice that since the membership problem reduces to the separation problem, this also 
gives a new proof that all these fragments have a decidable membership problem. This is 
of particular interest for $\mathrm{FO}^2(<,+1)$, $\mathcal{B}\Sigma_1(<,+1,\min,\max)$ and $\Sigma_2(<,+1)$ for which the
previous proofs, which can be found in, or derived from [28, 1, 18], [10], and [8, 17, 15]
respectively, are known to be quite involved. It turns out that for $\Sigma_2(<,+1)$, we can do even
better and entirely avoid separation. Indeed, when $\mathcal{F}$ is expressive enough, Theorem 4 can
be used to prove a similar theorem for the membership problem.

► Theorem 6. Let $\mathcal{F}$ and $\mathcal{F}^+$ be respectively the weak and strong variants of one of the
logical fragments in Figure 1. Moreover, assume that for any alphabet of well-formed words,
the set of well-formed words over this alphabet is definable in $\mathcal{F}$.

Let $L$ be a language recognized by a morphism $\alpha : A^+ \to S$ into a finite semigroup $S$. Let
$\mathbb{L} \subseteq A_\alpha^+$ be the language of well-formed words associated with $L$. Then $L$ is definable in $\mathcal{F}^+$
iff $\mathbb{L}$ is definable in $\mathcal{F}$.

Proof. Set $K = A^+ \setminus L$ and let $\mathbb{K}$ be the associated language of well-formed words. Observe
that by definition, $\mathbb{K} \cup \mathbb{L}$ is the set of all well-formed words.

If $\mathbb{L}$ is definable in $\mathcal{F}$, then $\mathbb{L}$ is $\mathcal{F}$-separable from $\mathbb{K}$, hence by Theorem 4, $L$ is
$\mathcal{F}^+$-separable from $K$, and so $L$ is definable in $\mathcal{F}^+$. Conversely, if $L$ is definable in $\mathcal{F}^+$, then $L$ is
$\mathcal{F}^+$-separable from $K$ and by Theorem 4, $\mathbb{L}$ is $\mathcal{F}$-separable from $\mathbb{K}$. Since $\mathbb{K} \cup \mathbb{L}$ is the set of
all well-formed words, $\mathbb{L}$ is the intersection of the separator with the set of all well-formed
words, which by hypothesis is also definable in $\mathcal{F}$. Therefore, $\mathbb{L}$ is definable in $\mathcal{F}$. ◄

Observe that being well-formed can be expressed in $\Pi_2(<)$: essentially, a word is
well-formed if for all pairs of positions, either there is a third one in-between, or the labels of the
two positions are "compatible". Hence, among the fragments of Figure 1, Theorem 6 applies
to all fragments including and above $\Pi_2(<)$ in the quantifier alternation hierarchy. While
such a transfer result was previously known [28, 17], the presentation and the proof are new.
In particular, since membership is known to be decidable for $\Pi_2(<)$ [15], $\mathcal{B}\Sigma_2(<)$ [21] and
$\Sigma_3(<)$ [21], we obtain new and simpler proofs of the following results.

► Corollary 7. Given a regular language $L$, one can decide whether

■ $L$ is definable by a $\Sigma_2(<,+1)$ (resp. by a $\Pi_2(<,+1)$) formula.
■ $L$ is definable by a $\mathcal{B}\Sigma_2(<,+1)$ formula.
■ $L$ is definable by a $\Sigma_3(<,+1)$ (resp. by a $\Pi_3(<,+1)$) formula.

It remains to prove Theorem 4. We devote the rest of the section to this proof. An
important remark is that the proof of the right to left direction, presented in Section 5.1, is
constructive: we start with an $\mathcal{F}$ formula that separates $\mathbb{L}$ from $\mathbb{L}'$ and use it to construct an
$\mathcal{F}^+$ formula that separates $L$ from $L'$. Note that the argument is generic for all fragments
we consider.

On the other hand, the converse direction to which Section 5.2 is devoted, namely
Proposition 13 below, requires a specific argument tailored to each fragment: a straightforward
but tedious Ehrenfeucht-Fraïssé argument.

5.1 From $\mathcal{F}$-separation to $\mathcal{F}^+$-separation

We now prove that if $\mathbb{L}$ is $\mathcal{F}$-separable from $\mathbb{L}'$, then $L$ is $\mathcal{F}^+$-separable from $L'$. We do so
by building an $\mathcal{F}^+$-definable separator. This proof is constructive and entirely generic. We
rely on a construction that associates to any word $w \in A^+$ a canonical well-formed word
$\lfloor w \rfloor \in A_\alpha^+$.

Canonical Well-formed Word Associated to a Word. To any word $w$ of $A^+$, we associate
a canonical well-formed word $\lfloor w \rfloor \in A_\alpha^+$ such that $\alpha(w) = \beta(\lfloor w \rfloor)$. This construction is
adapted from [18] and is originally inspired by [28].

Fix an arbitrary order on the set $E(S)$. For a position $x$ of $w$, let $u_x \in A^+$ be the
infix of $w$ obtained by keeping only positions $x - (|S| - 1)$ to $x$. If position $x - (|S| - 1)$
does not exist, $u_x$ is just the prefix of $w$ ending at $x$. A position $x$ is said to be distinguished if
there exists an idempotent $e \in E(S)$ such that $\alpha(u_x) \cdot e = \alpha(u_x)$. Additionally, we always
define the rightmost position as distinguished, even if it does not satisfy the property. Set
$x_1 < \cdots < x_{n+1}$ as the distinguished positions in $w$, so that $x_{n+1}$ is the rightmost position.
Let $e_1, \ldots, e_n \in E(S)$ be such that for all $1 \leq i \leq n$, $e_i$ is the smallest idempotent such
that $\alpha(u_{x_i}) \cdot e_i = \alpha(u_{x_i})$.

If $n = 0$, i.e., if the only distinguished position is the rightmost one, set $\lfloor w \rfloor = \alpha(w) \in A_\alpha$.
Otherwise, we define $\lfloor w \rfloor \in A_\alpha^+$ as the word:

$$\lfloor w \rfloor = (\alpha(w_0), e_1) \cdot (e_1, \alpha(w_1), e_2) \cdots (e_{n-1}, \alpha(w_{n-1}), e_n) \cdot (e_n, \alpha(w_n)) \qquad (1)$$

where $w_0$ is the prefix of $w$ ending at position $x_1$, for all $1 \leq i \leq n-1$, $w_i$ is the infix of $w$
obtained by keeping positions $x_i + 1$ to $x_{i+1}$, and $w_n$ is the suffix of $w$ starting at position
$x_n + 1$. Note that by construction, $\lfloor w \rfloor$ is well-formed.

The next statement follows from the definition of $\beta$, and from the fact that by definition
of the words $w_i$ and of the chosen idempotents, we have $\alpha(w_0 \cdots w_i) e_{i+1} = \alpha(w_0 \cdots w_i)$.

► Fact 8. For all $w \in A^+$, we have $\alpha(w) = \beta(\lfloor w \rfloor)$. Therefore, $w \in L$ iff $\lfloor w \rfloor \in \mathbb{L}$ and
$w \in L'$ iff $\lfloor w \rfloor \in \mathbb{L}'$.
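
The construction of the canonical well-formed word and the identity stated in Fact 8 can be tested on small examples. The Python sketch below is an illustration under several assumptions that are not in the paper: positions are 0-based, letters of the new alphabet are encoded as tagged tuples (`"s"`, `"se"`, `"esf"`, `"es"`), the morphism is a dictionary from letters to semigroup elements, and the example semigroup is {1, 2, 3} with truncated addition, whose only idempotent is 3.

```python
from functools import reduce
from itertools import product


def image(word, alpha, mul):
    """The morphism alpha extended to a nonempty word."""
    return reduce(mul, (alpha[c] for c in word))


def canonical_word(w, alpha, mul, size, idempotents):
    """Canonical well-formed word of w; `size` is the cardinality of the
    semigroup and `idempotents` lists its idempotents in the fixed order."""
    def chosen_idempotent(x):
        u = w[max(0, x - size + 1): x + 1]   # last `size` letters ending at x
        s = image(u, alpha, mul)
        return next((e for e in idempotents if mul(s, e) == s), None)

    # Distinguished positions other than the rightmost, with their idempotents.
    marked = [(x, chosen_idempotent(x)) for x in range(len(w) - 1)]
    marked = [(x, e) for x, e in marked if e is not None]
    if not marked:
        return [("s", image(w, alpha, mul))]
    cuts = [x for x, _ in marked]
    es = [e for _, e in marked]
    bounds = [0] + [x + 1 for x in cuts] + [len(w)]
    values = [image(w[bounds[k]: bounds[k + 1]], alpha, mul)
              for k in range(len(bounds) - 1)]
    result = [("se", values[0], es[0])]
    for k in range(1, len(cuts)):
        result.append(("esf", es[k - 1], values[k], es[k]))
    result.append(("es", es[-1], values[-1]))
    return result


def beta(word, mul):
    """Product of all semigroup components of all letters, in order."""
    return reduce(mul, (c for letter in word for c in letter[1:]))


def trunc_add(s, t):
    """Addition on {1, 2, 3} truncated at 3."""
    return min(s + t, 3)


ALPHA = {"a": 1, "b": 2}
print(canonical_word("abab", ALPHA, trunc_add, 3, [3]))
for n in range(1, 7):
    for p in product("ab", repeat=n):
        w = "".join(p)
        assert beta(canonical_word(w, ALPHA, trunc_add, 3, [3]), trunc_add) \
            == image(w, ALPHA, trunc_add)
```

The final loop checks the identity of Fact 8 on all words of length at most 6 for this particular morphism; it is a sanity check on an example, not a proof.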

To any distinguished position $x_i$ in $w$, we now associate the position $\lfloor x_i \rfloor = i$ in $\lfloor w \rfloor$. Our
main motivation for using this construction is its local canonicity, which is stated in the
following lemma.

► Lemma 9. Let $w \in A^+$. Then we have the following properties:

(a) whether a position $x$ is distinguished in $w$, and if so the label of position $\lfloor x \rfloor$ in $\lfloor w \rfloor$, only
depends on the infix of $w$ of length $2|S|$ ending at position $x$. That is, if the infixes of
length $2|S|$ ending at $x$ and $y$ are equal, then $x$ is distinguished iff so is $y$, and in that
case, the labels of $\lfloor x \rfloor$ and $\lfloor y \rfloor$ in $\lfloor w \rfloor$ are equal.

(b) the label of the last position of $\lfloor w \rfloor$ only depends on the suffix of length $2|S|$ of $w$.

Proof. It is immediate that whether $x$ is distinguished and if so the associated idempotent
only depends on the infix $u_x$ of length at most $|S|$ ending at $x$. Therefore, to prove (a), it
suffices to show that all infixes $w_i$ used in (1) are of size at most $|S|$, or in other words, that
among $|S| + 1$ consecutive positions, at least one is distinguished. So let us consider an infix
$a_1 \cdots a_{|S|+1}$ of $w$ of length $|S| + 1$. It is immediate from the pigeonhole principle that there
exist $i < j$ such that $\alpha(a_1 \cdots a_i) = \alpha(a_1 \cdots a_j) = \alpha(a_1 \cdots a_i) \cdot (\alpha(a_{i+1} \cdots a_j))^\omega$. Hence, the
position corresponding to $a_i$ is distinguished. The proof of the second assertion is similar. ◄

$L$ is $\mathcal{F}^+$-separable from $L'$. We can now construct our separator. The construction follows
from the next proposition.

► Proposition 10. Let $K \subseteq A_\alpha^+$ be a language that can be defined using an $\mathcal{F}$ formula $\varphi$. Then
there exists an $\mathcal{F}^+$ formula $\Psi$ over alphabet $A$ such that for every word $w \in A^+$:

$$w \models \Psi \quad\text{if and only if}\quad \lfloor w \rfloor \models \varphi.$$

Proof. Proposition 10 follows from the following simple consequence of Lemma 9. 

► Claim 11. For any $a \in A_\alpha$ there exists a formula $\gamma_a(x)$ of $\mathcal{F}^+$ with a free variable $x$, such 
that for any $w \in A^+$ and any position $x$ of $w$, we have $w \models \gamma_a(x)$ iff $x$ is distinguished and 
$\lfloor x \rfloor$ has label $a$ in $\lfloor w \rfloor$. 

This claim holds since by Lemma 9, formula $\gamma_a(x)$ only needs to explore the neighborhood 
of size $2|S|$ of $x$, which is trivially possible for all fragments $\mathcal{F}^+$ we consider. To conclude 
the proof of Proposition 10, it suffices to define $\Psi$ as the formula constructed from $\varphi$ by 
restricting all quantifiers to positions that are distinguished and by replacing all tests $P_a(x)$ 
by $\gamma_a(x)$. ◄ 

We can now finish the proof of Theorem 4. Assume that $\mathbb{L}$ is $\mathcal{F}$-separable from $\mathbb{L}'$ and 
let $\varphi$ be an $\mathcal{F}$ formula defining a separator. We denote by $\Psi$ the $\mathcal{F}^+$ formula obtained from 
$\varphi$ as defined in Proposition 10. We prove that $\Psi$ defines a language separating $L$ from $L'$. 

We first prove that $L \subseteq \{w \mid w \models \Psi\}$. Assume that $w \in L$. Then by Fact 8, we have 
$\lfloor w \rfloor \in \mathbb{L}$. Hence, $\lfloor w \rfloor \models \varphi$ and so $w \models \Psi$ by definition of $\Psi$. The proof that $L' \subseteq \{w \mid w \not\models \Psi\}$ 
is identical: if $w \in L'$, we have $\lfloor w \rfloor \in \mathbb{L}'$ by Fact 8. Hence, $\lfloor w \rfloor \not\models \varphi$ and $w \not\models \Psi$ by definition 
of $\Psi$. ◄ 

5.2 From $\mathcal{F}^+$-separation to $\mathcal{F}$-separation 

To complete the proof of Theorem 4, it remains to prove that if $L$ is $\mathcal{F}^+$-separable from 
$L'$, then $\mathbb{L}$ is $\mathcal{F}$-separable from $\mathbb{L}'$. The proof is this time specific to each fragment, as it 
requires, in one direction of the reduction, a dedicated (but simple) Ehrenfeucht-Fraïssé 
argument. We actually prove the contrapositive: if $\mathbb{L}$ is not $\mathcal{F}$-separable from $\mathbb{L}'$, then $L$ is 
not $\mathcal{F}^+$-separable from $L'$. We rely on a construction that is dual to the one used previously: 
to any well-formed word $\mathbb{w} \in A_\alpha^+$ and any integer $i > 0$, we associate a canonical word 
$\lceil \mathbb{w} \rceil_i \in A^+$. 


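Canonical Word Associated to a Well-formed Word. To any $s \in S$, we associate an 
arbitrarily chosen nonempty word $\lceil s \rceil \in A^+$ such that $\alpha(\lceil s \rceil) = s$ (which is possible since 
$\alpha$ has been chosen surjective). Let $i > 0$. From a well-formed word $\mathbb{w} \in A_\alpha^+$, we build a 
word $\lceil \mathbb{w} \rceil_i \in A^+$ as follows. If $\mathbb{w} = s \in S$, then $\lceil \mathbb{w} \rceil_i = \lceil s \rceil$ for all $i$. Otherwise, we have by 
definition 

$\mathbb{w} = (s_0, e_1)(e_1, s_1, e_2) \cdots (e_{n-1}, s_{n-1}, e_n)(e_n, s_n)$. 

For a natural $i > 0$, we set 

$\lceil \mathbb{w} \rceil_i = \lceil s_0 \rceil \lceil e_1 \rceil^i \lceil s_1 \rceil \lceil e_2 \rceil^i \cdots \lceil e_{n-1} \rceil^i \lceil s_{n-1} \rceil \lceil e_n \rceil^i \lceil s_n \rceil$. 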


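Recall that $\beta$ is the morphism $\beta : A_\alpha^+ \to S$ mapping $\mathbb{w}$ to $s_0 e_1 s_1 \cdots s_{n-1} e_n s_n$. Since $e_j \in E(S)$ 
for all $j$, it is immediate that $\alpha(\lceil \mathbb{w} \rceil_i) = \beta(\mathbb{w})$, hence we get the following fact: 

► Fact 12. For all $i > 0$ and all well-formed $\mathbb{w} \in A_\alpha^+$, we have $\mathbb{w} \in \mathbb{L}$ (resp. $\mathbb{w} \in \mathbb{L}'$) if and 
only if $\lceil \mathbb{w} \rceil_i \in L$ (resp. $\lceil \mathbb{w} \rceil_i \in L'$). 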
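
The identity $\alpha(\lceil \mathbb{w} \rceil_i) = \beta(\mathbb{w})$ behind Fact 12 can be checked on a toy instance. The following Python sketch is only an illustration under assumptions that are not part of the paper: the semigroup $S = \{0, 1, 2\}$ with saturating addition, the alphabet $\{a, b\}$, the morphism `alpha`, the representatives `REP`, and the encoding of a well-formed word by the two lists $s_0, \dots, s_n$ and $e_1, \dots, e_n$ are all hypothetical choices.

```python
from functools import reduce

# Assumed toy semigroup: S = {0, 1, 2} with saturating addition.
def mul(s, t):
    return min(s + t, 2)

IDEMPOTENTS = [s for s in range(3) if mul(s, s) == s]  # E(S) = [0, 2]

# Assumed surjective morphism alpha : {a, b}^+ -> S, given on letters.
LETTER = {"a": 1, "b": 0}

def alpha(word):
    """Image of a nonempty word under alpha."""
    return reduce(mul, (LETTER[c] for c in word))

# Arbitrarily chosen nonempty representative of each element of S.
REP = {0: "b", 1: "a", 2: "aa"}

def beta(ss, es):
    """beta of the well-formed word (s0,e1)(e1,s1,e2)...(en,sn)."""
    out = ss[0]
    for e, s in zip(es, ss[1:]):
        out = mul(mul(out, e), s)
    return out

def canonical(ss, es, i):
    """Canonical word rep(s0) rep(e1)^i rep(s1) ... rep(en)^i rep(sn)."""
    assert len(ss) == len(es) + 1 and all(e in IDEMPOTENTS for e in es)
    word = REP[ss[0]]
    for e, s in zip(es, ss[1:]):
        word += REP[e] * i + REP[s]
    return word
```

For instance, `canonical([1, 1, 0], [0, 2], 3)` repeats the representative of each idempotent three times, and `alpha` of the result coincides with `beta([1, 1, 0], [0, 2])` because the powers of an idempotent collapse.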

We now proceed with the proof. We use the classical preorders associated to fragments 
of first-order logic. The (quantifier) rank $\mathrm{rank}(\varphi)$ of a first-order formula $\varphi$ is the largest 
number of quantifiers along a branch in the parse tree of $\varphi$. Formally, $\mathrm{rank}(\varphi) = 0$ if $\varphi$ 
is an atomic formula, $\mathrm{rank}(\neg\varphi) = \mathrm{rank}(\varphi)$, $\mathrm{rank}(\varphi_1 \vee \varphi_2) = \max(\mathrm{rank}(\varphi_1), \mathrm{rank}(\varphi_2))$ and 
$\mathrm{rank}(\exists x\, \varphi) = \mathrm{rank}(\varphi) + 1$. 
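
The inductive definition of the rank translates directly into a recursive function. The sketch below is a minimal illustration; the formula representation (the classes `Atom`, `Not`, `Or` and `Exists`) is a hypothetical encoding chosen for this example, not notation from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    text: str  # e.g. "x < y" or "P_a(x)"

@dataclass(frozen=True)
class Not:
    arg: object

@dataclass(frozen=True)
class Or:
    left: object
    right: object

@dataclass(frozen=True)
class Exists:
    var: str
    body: object

def rank(phi):
    """Largest number of quantifiers along a branch of the parse tree."""
    if isinstance(phi, Atom):
        return 0
    if isinstance(phi, Not):
        return rank(phi.arg)
    if isinstance(phi, Or):
        return max(rank(phi.left), rank(phi.right))
    if isinstance(phi, Exists):
        return rank(phi.body) + 1
    raise TypeError("unknown formula node")

# Example: exists x (P_a(x) or not exists y (x < y)) has rank 2.
EXAMPLE = Exists("x", Or(Atom("P_a(x)"), Not(Exists("y", Atom("x < y")))))
```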

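Given $u, v \in A^+$, we write $u \preceq_k^+ v$ if any $\mathcal{F}^+$ formula of rank $k$ that is satisfied by $u$ is 
satisfied by $v$ as well. Similarly, for $\mathbb{w}, \mathbb{w}' \in A_\alpha^+$, we write $\mathbb{w} \preceq_k \mathbb{w}'$ if any $\mathcal{F}$ formula of rank $k$ 
that is satisfied by $\mathbb{w}$ is satisfied by $\mathbb{w}'$ as well. One can verify that $\preceq_k$ and $\preceq_k^+$ are preorders, 
as well as the following standard fact: 

$L \subseteq A^+$ is definable by an $\mathcal{F}^+$ formula of rank $k$ iff $L = \{u' \mid \exists u \in L \text{ s.t. } u \preceq_k^+ u'\}$ 
$\mathbb{L} \subseteq A_\alpha^+$ is definable by an $\mathcal{F}$ formula of rank $k$ iff $\mathbb{L} = \{\mathbb{w}' \mid \exists \mathbb{w} \in \mathbb{L} \text{ s.t. } \mathbb{w} \preceq_k \mathbb{w}'\}$.    (2) 

Note that when $\mathcal{F}$ and $\mathcal{F}^+$ are closed under complement, then $\preceq_k$ and $\preceq_k^+$ are actually 
equivalence relations. We can now state the main proposition of this direction. 

► Proposition 13. For any $k \in \mathbb{N}$, there exist $\ell \in \mathbb{N}$ and $i \in \mathbb{N}$ such that for any well-formed 
words $\mathbb{w}, \mathbb{w}' \in A_\alpha^+$ satisfying $\mathbb{w} \preceq_\ell \mathbb{w}'$, we have $\lceil \mathbb{w} \rceil_i \preceq_k^+ \lceil \mathbb{w}' \rceil_i$. 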

Before proving Proposition 13, we explain how to use it to show the first direction of 
Theorem 4. We argue by contrapositive: assume that $\mathbb{L}$ is not $\mathcal{F}$-separable from $\mathbb{L}'$. By 
definition this means that no language definable in $\mathcal{F}$ separates $\mathbb{L}$ from $\mathbb{L}'$. In particular, for 
any $\ell \in \mathbb{N}$, the language 

$\{\mathbb{w}' \mid \exists \mathbb{w} \in \mathbb{L} \text{ s.t. } \mathbb{w} \preceq_\ell \mathbb{w}'\}$, 

which is definable in $\mathcal{F}$ by (2), cannot be a separator. Note that this language contains $\mathbb{L}$, 
so it has to intersect $\mathbb{L}'$. Hence, for all $\ell \in \mathbb{N}$, there exist $\mathbb{w} \in \mathbb{L}$ and $\mathbb{w}' \in \mathbb{L}'$ such that $\mathbb{w} \preceq_\ell \mathbb{w}'$. We deduce from 
Proposition 13 and Fact 12 that for all $k \in \mathbb{N}$, there exist $u \in L$ and $u' \in L'$ such that 
$u \preceq_k^+ u'$. It follows, again by (2), that $L$ is not $\mathcal{F}^+$-separable from $L'$, which concludes 
the proof. 

We now prove Proposition 13 for the fragments we are interested in. As already explained, 
this proposition is proved using classical, but specific Ehrenfeucht-Fraïssé arguments for each 
fragment. While each proof is specific, the underlying ideas are similar. 

Here, we consider two main cases, $\mathcal{F} = \mathrm{FO}^2(<)$ and $\mathcal{F} = \Sigma_n(<)$ for some $n$. Note that 
we will obtain the case $\mathcal{F} = \mathcal{B}\Sigma_n(<)$ as a simple consequence of the case $\mathcal{F} = \Sigma_n(<)$. Finally, 
we leave out the case $\mathcal{F} = \mathrm{FO}(=)$, as the argument is essentially a copy and paste of the 
argument for $\Sigma_n(<)$. 

5.2.1 $\mathrm{FO}^2(<)$ and $\mathrm{FO}^2(<,+1)$ 

Observe that since $\mathrm{FO}^2(<)$ and $\mathrm{FO}^2(<,+1)$ are both closed under complement, the preorders 
$\preceq_k$ and $\preceq_k^+$ are actually equivalence relations. To avoid confusion with other fragments, we 
denote by $\equiv_k$ and $\equiv_k^+$ these two equivalences. We prove the following proposition, which 
clearly entails Proposition 13. 

► Proposition 14. For any $k \in \mathbb{N}$, given $\mathbb{w}, \mathbb{w}' \in A_\alpha^+$ we have the following implication: 

$\mathbb{w} \equiv_k \mathbb{w}' \Rightarrow \lceil \mathbb{w} \rceil_{2k} \equiv_k^+ \lceil \mathbb{w}' \rceil_{2k}$. 

This is proved using an Ehrenfeucht-Fraïssé argument. We first define the Ehrenfeucht- 
Fraïssé game associated to $\mathrm{FO}^2(<)$ (i.e., corresponding to $\equiv_k$) and then explain how to 
adapt it to $\equiv_k^+$. 

Ehrenfeucht-Fraïssé Game. The board of the $\mathrm{FO}^2(<)$-game consists of two words, and the game lasts 
a predefined number $k$ of rounds. There are two players called Spoiler and Duplicator. At any 
time during the game there is one pebble placed on a position of one word and one pebble 
placed on a position of the other word, and both positions have the same label. When the 
game starts, both pebbles are placed on the first position of each word. Each round starts 
with Spoiler choosing one of the pebbles, and moving it inside its word from its original 
position $x$ to a new position $y$. Duplicator must answer by moving the other pebble in the 
other word from its original position $x'$ to a new position $y'$. Moreover, $x'$ and $y'$ must satisfy 
the same relations as $x$ and $y$ among '<' and the label predicates. 

Duplicator wins if she manages to play for all $k$ rounds. Spoiler wins as soon as Duplicator 
is unable to play. 

The $\mathrm{FO}^2(<,+1)$-game is defined similarly with additional constraints for Duplicator. 
When Spoiler makes a move, Duplicator must choose her answer $y'$ so that $x'$ and $y'$ satisfy 
the same relations as $x$ and $y$ among $+1$, $<$ and the label predicates. 

► Lemma 15 (Folklore). For any integer $k$ and any words $v, v'$, we have the following facts: 

■ $v \equiv_k v'$ iff Duplicator has a winning strategy in the $k$-round $\mathrm{FO}^2(<)$-game on $v$ and $v'$. 

■ $v \equiv_k^+ v'$ iff Duplicator has a winning strategy in the $k$-round $\mathrm{FO}^2(<,+1)$-game on $v$ 
and $v'$. 
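
On small words, the two games can be explored exhaustively. The following Python sketch is a brute-force illustration of the rules as stated above (pebbles start on the first positions, Spoiler moves one pebble, Duplicator answers in the other word); the function name `fo2_duplicator_wins` and the `successor` flag are hypothetical, and the search is exponential in general, so it is meant for experimentation only.

```python
from functools import lru_cache

def fo2_duplicator_wins(v, w, k, successor=False):
    """True iff Duplicator can play k rounds of the two-pebble game on v, w.

    v and w are nonempty strings; successor=True adds the +1 constraints.
    """

    def same_relations(x, y, xp, yp):
        ok = (x < y) == (xp < yp) and (y < x) == (yp < xp)
        if successor:
            ok = (ok and (y == x + 1) == (yp == xp + 1)
                  and (x == y + 1) == (xp == yp + 1))
        return ok

    @lru_cache(maxsize=None)
    def wins(x, xp, rounds):
        if rounds == 0:
            return True
        # Spoiler moves the pebble of v from x to y; Duplicator answers in w.
        for y in range(len(v)):
            if not any(w[yp] == v[y] and same_relations(x, y, xp, yp)
                       and wins(y, yp, rounds - 1) for yp in range(len(w))):
                return False
        # Spoiler moves the pebble of w from xp to yp; Duplicator answers in v.
        for yp in range(len(w)):
            if not any(v[y] == w[yp] and same_relations(x, y, xp, yp)
                       and wins(y, yp, rounds - 1) for y in range(len(v))):
                return False
        return True

    # Both pebbles start on the first positions, which must share a label.
    return v[0] == w[0] and wins(0, 0, k)
```

On the words `"aaab"` and `"aab"`, Duplicator survives one round of the game without successor, but Spoiler wins in one round as soon as the successor constraints are added, which illustrates why the canonical words need long enough blocks of repeated idempotents.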

To prove Proposition 14, let $\mathbb{w}, \mathbb{w}' \in A_\alpha^+$, and set $u = \lceil \mathbb{w} \rceil_{2k}$ and $u' = \lceil \mathbb{w}' \rceil_{2k}$. We want 
to show that $u \equiv_k^+ u'$. In view of Lemma 15, it is enough to exhibit a winning 
strategy for Duplicator in the $k$-round $\mathrm{FO}^2(<,+1)$-game played on $u$ and $u'$. We call $\mathcal{G}$ this 
game. The strategy involves playing a shadow $\mathrm{FO}^2(<)$-game $\mathcal{S}$ on $\mathbb{w}$ and $\mathbb{w}'$. Observe that 
by hypothesis and by Lemma 15, Duplicator has a winning strategy for $k$ rounds in the game 
$\mathcal{S}$. We begin by setting up some notation to help us define Duplicator’s strategy in $\mathcal{G}$. 

Notation. Assuming that $\mathbb{w} \equiv_k \mathbb{w}'$, we need to prove that $u \equiv_k^+ u'$. If $\mathbb{w} \in S$ or $\mathbb{w}' \in S$, then 
$\mathbb{w} = \mathbb{w}' = s \in S$ (since the only well-formed word that contains letter $s \in S$ is $s$ itself) and 
the result is immediate. 

Otherwise, by hypothesis, the words $\mathbb{w}$ and $\mathbb{w}'$ are of the form 

$\mathbb{w} = (s_0, e_1)(e_1, s_1, e_2) \cdots (e_m, s_m)$ 
$\mathbb{w}' = (s'_0, e'_1)(e'_1, s'_1, e'_2) \cdots (e'_{m'}, s'_{m'})$. 

In particular, observe that since $\mathbb{w} \equiv_k \mathbb{w}'$ and the labels of the leftmost and rightmost positions 
occur only at these positions in $\mathbb{w}$ and $\mathbb{w}'$, we have $(s_0, e_1) = (s'_0, e'_1)$ and $(e_m, s_m) = (e'_{m'}, s'_{m'})$. 
For the sake of simplifying the presentation, we assume that for all $i \leq m$, we have 
$\lceil s_i \rceil = a_i \in A$ and $\lceil e_i \rceil = b_i \in A$ (this does not harm the generality of the proof). Similarly, 
for all $i \leq m'$, we assume that $\lceil s'_i \rceil = a'_i \in A$ and $\lceil e'_i \rceil = b'_i \in A$. By definition, we have 

$u = \lceil \mathbb{w} \rceil_{2k} = a_0 (b_1)^{2k} a_1 (b_2)^{2k} \cdots (b_m)^{2k} a_m$ 
$u' = \lceil \mathbb{w}' \rceil_{2k} = a'_0 (b'_1)^{2k} a'_1 (b'_2)^{2k} \cdots (b'_{m'})^{2k} a'_{m'}$. 

To treat the beginning and the end of the words uniformly as the other factors, we set 
$b_0, b'_0, b_{m+1}, b'_{m'+1}$ as the empty word. 

Winning Strategy. Let $\ell$ be the number of remaining rounds at some point in the game. 
We define an invariant $\mathcal{I}(\ell)$ that Duplicator has to satisfy when playing. Assume that the 
pebbles in $u, u'$ are at positions $x, x'$ in $\mathcal{G}$ and that the pebbles in $\mathbb{w}, \mathbb{w}'$ are at positions $i, i'$ 
in $\mathcal{S}$. Then, $\mathcal{I}(\ell)$ holds when so do all following properties: 

1. Duplicator has a winning strategy for playing $\ell$ rounds in $\mathcal{S}$. In particular, this means 
that $i, i'$ have the same label, and therefore that $(b_i, a_i, b_{i+1}) = (b'_{i'}, a'_{i'}, b'_{i'+1})$. 

2. Pebbles $x$ and $x'$ are inside the identical factors $(b_i)^{2k} a_i (b_{i+1})^{2k}$ and $(b'_{i'})^{2k} a'_{i'} (b'_{i'+1})^{2k}$, 
and at the same relative position. 

3. There are at least $\ell$ copies of $b_i$ (resp. $b'_{i'}$) to the left of $x$ (resp. $x'$) and $\ell$ copies of $b_{i+1}$ 
(resp. $b'_{i'+1}$) to the right of $x$ (resp. $x'$). 

It is clear that $\mathcal{I}(k)$ holds at the beginning of the game. Moreover, since Duplicator will 
follow her strategy in $\mathcal{S}$, Item 1 will be fulfilled. Assume now that $\mathcal{I}(\ell + 1)$ holds and that 
there are $(\ell + 1)$ rounds left to play. We explain how Duplicator can answer a move by 
Spoiler while enforcing $\mathcal{I}(\ell)$. Assume that Spoiler moves the pebble in $u$ to a new position $y$ 
(the dual case, when Spoiler plays in $u'$, is treated similarly). There are two distinct cases. 

■ If $y$ remains in the factor $(b_i)^{2k} a_i (b_{i+1})^{2k}$ and satisfies Item 3 of $\mathcal{I}(\ell)$, then Duplicator 
simply copies Spoiler’s move in $(b'_{i'})^{2k} a'_{i'} (b'_{i'+1})^{2k}$. The positions $i$ and $i'$ remain unchanged 
and $\mathcal{I}(\ell)$ is clearly satisfied. 

■ Otherwise, observe that by $\mathcal{I}(\ell + 1)$, Spoiler’s move $y$ cannot be equal to $x \pm 1$. This 
means that Duplicator has to answer in the same direction and on the same label as 
Spoiler did, but not on positions $x' \pm 1$. Because Item 3 is not satisfied, position $y$ belongs 
to some $(b_j)^{2k} a_j (b_{j+1})^{2k}$ with $j \neq i$, with at least $\ell$ copies of $b_j$ to its left and $\ell$ copies of 
$b_{j+1}$ to its right. To compute her answer, Duplicator simulates a move by Spoiler in $\mathcal{S}$ by 
moving the pebble from position $i$ to position $j$. From her winning strategy in $\mathcal{S}$, she 
obtains a position $j'$ in $\mathbb{w}'$ such that $(b_j, a_j, b_{j+1}) = (b'_{j'}, a'_{j'}, b'_{j'+1})$. She picks as position 
$y'$ the same relative position in $(b'_{j'})^{2k} a'_{j'} (b'_{j'+1})^{2k}$ as $y$ in $(b_j)^{2k} a_j (b_{j+1})^{2k}$. Observe that 
since $j \neq i$, we have $y \neq x \pm 1$. Hence, this is a legal move for Duplicator. The new 
positions $y, y', j, j'$ satisfy $\mathcal{I}(\ell)$, which terminates the proof. 

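5.2.2 $\Sigma_n(<)$ and $\Sigma_n(<, +1, \min, \max)$ 

We fix some $n \in \mathbb{N}$. We keep using the symbols $\preceq_k$ and $\preceq_k^+$ to denote the preorders 
associated to $\Sigma_n(<)$ and $\Sigma_n(<, +1, \min, \max)$. Furthermore, we denote by $\cong_k$ and $\cong_k^+$ 
the equivalence relations associated to $\mathcal{B}\Sigma_n(<)$ and $\mathcal{B}\Sigma_n(<, +1, \min, \max)$. We prove the 
following proposition, which again yields Proposition 13 with $\ell = k$ and $i = 2^{k+1}$. 

► Proposition 16. For any $k \in \mathbb{N}$, given $\mathbb{w}, \mathbb{w}' \in A_\alpha^+$ we have the following implications: 

$\mathbb{w} \preceq_k \mathbb{w}' \Rightarrow \lceil \mathbb{w} \rceil_{2^{k+1}} \preceq_k^+ \lceil \mathbb{w}' \rceil_{2^{k+1}}$ 

$\mathbb{w} \cong_k \mathbb{w}' \Rightarrow \lceil \mathbb{w} \rceil_{2^{k+1}} \cong_k^+ \lceil \mathbb{w}' \rceil_{2^{k+1}}$. 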

Observe first that the second implication is an immediate consequence of the first one. 
Indeed, since $\mathcal{B}\Sigma_n$ formulas are boolean combinations of $\Sigma_n$ formulas, we have 

$v \preceq_k v'$ and $v' \preceq_k v$ if and only if $v \cong_k v'$ 

$v \preceq_k^+ v'$ and $v' \preceq_k^+ v$ if and only if $v \cong_k^+ v'$. 

Therefore, we concentrate on the first implication. As for $\mathrm{FO}^2(<)$, this is an Ehrenfeucht- 
Fraïssé argument. We first define the Ehrenfeucht-Fraïssé game associated to $\Sigma_n(<)$ (i.e., 
corresponding to $\preceq_k$) and then explain how to adapt it to $\preceq_k^+$. 

Ehrenfeucht-Fraïssé Game. The board of the $\Sigma_n(<)$-game consists of two words $v, v'$ and 
there are two players, again called Spoiler and Duplicator. Moreover, initially, there exists a 
distinguished word among $v, v'$ that we call the active word (this word may change as the 
game progresses). The game is set to last a predefined number $k$ of rounds. When the game 
starts, both players have $k$ pebbles. Contrary to the $\mathrm{FO}^2(<)$-game, once a pebble is dropped, 
it cannot be moved again during the game. Finally, there is a parameter that gets updated 
during the game, a counter $c$ called the alternation counter. Initially, $c$ is set to 0. It may be 
incremented, but it has to remain bounded by $n - 1$. 

At the start of each round $\ell$, Spoiler chooses a word, either $v$ or $v'$. Spoiler can always 
choose the active word, in which case both $c$ and the active word remain unchanged. However, 
Spoiler can only choose the word that is not active when $c < n - 1$, in which case the active 
word is switched and $c$ is incremented by 1 (in particular, this may happen at most $n - 1$ 
times). If Spoiler chooses $v$ (resp. $v'$), he puts a pebble on a position $x_\ell$ in $v$ (resp. $x'_\ell$ in $v'$). 

Duplicator must answer by putting a pebble at a position $x'_\ell$ in $v'$ (resp. $x_\ell$ in $v$). 
Moreover, Duplicator must ensure that all pebbles that have been placed up to this point 
verify the following condition: for all $\ell_1, \ell_2 \leq \ell$, the labels at positions $x_{\ell_1}, x'_{\ell_1}$ are the same, 
and $x_{\ell_1} < x_{\ell_2}$ if and only if $x'_{\ell_1} < x'_{\ell_2}$. 

Duplicator wins if she manages to play for all $k$ rounds, and Spoiler wins as soon as 
Duplicator is unable to play. 

The $\Sigma_n(<, +1, \min, \max)$-game is defined similarly with the following additional constraint 
for Duplicator: at any time, for all $\ell_1, \ell_2 \leq \ell$, we have $x_{\ell_1} = x_{\ell_2} + 1$ if and only if 
$x'_{\ell_1} = x'_{\ell_2} + 1$, $\min(x_{\ell_1})$ if and only if $\min(x'_{\ell_1})$ and $\max(x_{\ell_1})$ if and only if $\max(x'_{\ell_1})$. 

► Lemma 17 (Folklore). For all $k \in \mathbb{N}$ and $v, v'$, we have the following facts: 

■ $v \preceq_k v'$ iff Duplicator has a winning strategy in the $k$-round $\Sigma_n(<)$-game on $v$ and $v'$ 
with $v$ as initial active word. 

■ $v \preceq_k^+ v'$ iff Duplicator has a winning strategy in the $k$-round $\Sigma_n(<, +1, \min, \max)$-game 
on $v$ and $v'$ with $v$ as initial active word. 
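
As for the two-pebble game, the rules of the $\Sigma_n(<)$-game can be simulated by brute force on small words. The sketch below is a hypothetical illustration (the name `sigma_n_duplicator_wins` and the encoding of pebbles as pairs of positions are assumptions of this example); it covers only the game without successor, min and max, starts with the first word as active word, and is exponential in the number of rounds.

```python
from functools import lru_cache

def sigma_n_duplicator_wins(v, w, n, k):
    """True iff Duplicator can play k rounds with at most n - 1 alternations.

    v is the initial active word; pebbles are pairs (position in v, position in w).
    """
    words = (v, w)

    def consistent(pebbles, new):
        x, xp = new
        if v[x] != w[xp]:
            return False
        return all((x < y) == (xp < yp) and (y < x) == (yp < xp)
                   for y, yp in pebbles)

    @lru_cache(maxsize=None)
    def wins(pebbles, active, c, rounds):
        if rounds == 0:
            return True
        for side in (0, 1):  # the word chosen by Spoiler
            if side == active:
                new_c = c
            elif c < n - 1:
                new_c = c + 1  # switching words costs one alternation
            else:
                continue  # switching is no longer allowed
            for p in range(len(words[side])):
                answers = range(len(words[1 - side]))
                pairs = [(p, q) if side == 0 else (q, p) for q in answers]
                if not any(consistent(pebbles, pair)
                           and wins(pebbles | {pair}, side, new_c, rounds - 1)
                           for pair in pairs):
                    return False
        return True

    return wins(frozenset(), 0, 0, k)
```

With $n = 1$ Spoiler is confined to the first word, so `sigma_n_duplicator_wins("ab", "aab", 1, 2)` holds while the same call with the two words exchanged fails; allowing one alternation ($n = 2$) lets Spoiler switch to `"aab"` and win in two rounds.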

We now prove Proposition 16. Let $\mathbb{w}, \mathbb{w}' \in A_\alpha^+$. We have to prove that $\lceil \mathbb{w} \rceil_{2^{k+1}} \preceq_k^+ 
\lceil \mathbb{w}' \rceil_{2^{k+1}}$. In view of Lemma 17, this can be done by giving a winning strategy for Duplicator 
in the corresponding $k$-round $\Sigma_n(<, +1, \min, \max)$-game. We call $\mathcal{G}$ this game. Duplicator’s 
strategy involves playing a $\Sigma_n(<)$-game $\mathcal{S}$, called the shadow game, on $\mathbb{w}$ and $\mathbb{w}'$. By 
hypothesis and by Lemma 17, she has a winning strategy in $k$ rounds in the shadow game $\mathcal{S}$. 
We begin by setting up some notation that will help us define Duplicator’s strategy. 


Notation. Set $u = \lceil \mathbb{w} \rceil_{2^{k+1}}$ and $u' = \lceil \mathbb{w}' \rceil_{2^{k+1}}$. Assuming that $\mathbb{w} \preceq_k \mathbb{w}'$, we need to prove 
that $u \preceq_k^+ u'$. If $\mathbb{w} \in S$ or $\mathbb{w}' \in S$, then $\mathbb{w} = \mathbb{w}' = s \in S$ (again, the only well-formed word 
that contains the letter $s \in S$ is $s$). Therefore, $u = u'$ and the result is immediate. 

Otherwise, by hypothesis, the words $\mathbb{w}$ and $\mathbb{w}'$ are of the form 

$\mathbb{w} = (s_0, e_1)(e_1, s_1, e_2) \cdots (e_m, s_m)$ 
$\mathbb{w}' = (s'_0, e'_1)(e'_1, s'_1, e'_2) \cdots (e'_{m'}, s'_{m'})$ 

In particular, observe that since $\mathbb{w} \preceq_k \mathbb{w}'$ and the labels of the leftmost and rightmost positions 
occur only at these positions in $\mathbb{w}$ and $\mathbb{w}'$, we have $(s_0, e_1) = (s'_0, e'_1)$ and $(e_m, s_m) = (e'_{m'}, s'_{m'})$. 

For the sake of simplifying the presentation, we assume that for all $i \leq m$, we have 
$\lceil s_i \rceil = a_i \in A$ and $\lceil e_i \rceil = b_i \in A$ (this does not harm the generality of the proof). Similarly, 
for all $i \leq m'$, we assume that $\lceil s'_i \rceil = a'_i \in A$ and $\lceil e'_i \rceil = b'_i \in A$. By definition, we have 


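$u = \lceil \mathbb{w} \rceil_{2^{k+1}} = a_0 (b_1)^{2^{k+1}} a_1 (b_2)^{2^{k+1}} \cdots (b_m)^{2^{k+1}} a_m$ 
$u' = \lceil \mathbb{w}' \rceil_{2^{k+1}} = a'_0 (b'_1)^{2^{k+1}} a'_1 (b'_2)^{2^{k+1}} \cdots (b'_{m'})^{2^{k+1}} a'_{m'}$. 

Again, to treat the beginning and the end of the words uniformly as the other factors, we set 
$b_0, b'_0, b_{m+1}, b'_{m'+1}$ as the empty word. 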


Winning Strategy. Let $\ell$ be the number of remaining rounds at some point in the game. We 
define an invariant $\mathcal{I}(\ell)$ that Duplicator has to satisfy when playing. 

As she plays, Duplicator associates to each position $i \in \mathbb{w}$ (resp. $i' \in \mathbb{w}'$) a set of positions 
in $u$ (resp. $u'$) called the set of marked positions for $i$ (resp. for $i'$). All marked positions 
for $i$ (resp. for $i'$) must belong to the $b_i$, $a_i$ or $b_{i+1}$ (resp. $b'_{i'}$, $a'_{i'}$ or $b'_{i'+1}$) positions in $u$ 
(resp. in $u'$). Initially, for all $i$ (resp. $i'$), only $a_i$ (resp. $a'_{i'}$) is marked for $i$ (resp. for $i'$). 
Duplicator may define more positions as marked as the game progresses. All these new 
marked positions will be positions holding pebbles in $\mathcal{G}$. 

Assume that there are $\ell$ rounds left to play and that pebbles have already been placed 
on $u, u'$ in the main game $\mathcal{G}$ and on $\mathbb{w}, \mathbb{w}'$ in $\mathcal{S}$ in a way that satisfies the conditions of 
both Ehrenfeucht-Fraïssé games. We denote by $c_\mathcal{G}$ the alternation counter of the main 
game $\mathcal{G}$ and by $c_\mathcal{S}$ that of the shadow game $\mathcal{S}$. For all $i \in \mathbb{w}$ (resp. $i' \in \mathbb{w}'$) we denote by 
$x_1(i) < \cdots < x_{m_i}(i)$ (resp. $x'_1(i') < \cdots < x'_{m'_{i'}}(i')$) the marked positions for $i$ (resp. $i'$). 
Then $\mathcal{I}(\ell)$ holds if the following properties hold: 

1. Duplicator has a winning strategy for playing at least $\ell$ more rounds in $\mathcal{S}$. Furthermore, 
either $c_\mathcal{S} < c_\mathcal{G}$, or $c_\mathcal{S} = c_\mathcal{G}$ and the active words in $\mathcal{S}$ and $\mathcal{G}$ are either $\mathbb{w}$ and $u$, or $\mathbb{w}'$ 
and $u'$. 

2. Any position $x \in u$ (resp. $x' \in u'$) that holds a pebble in $\mathcal{G}$ is marked for some $i \in \mathbb{w}$ 
(resp. $i' \in \mathbb{w}'$) holding a pebble in $\mathcal{S}$. Conversely, any position that is marked for $i \in \mathbb{w}$ 
(resp. $i' \in \mathbb{w}'$) is either $a_i$ (resp. $a'_{i'}$) or a position holding a pebble in $\mathcal{G}$. 

3. For all $i \in \mathbb{w}$ (resp. $i' \in \mathbb{w}'$), we have $x_{m_i}(i) < x_1(i + 1)$ (resp. $x'_{m'_{i'}}(i') < x'_1(i' + 1)$). 
Moreover, there are at least $2^{\ell+1}$ copies of $b_{i+1}$ (resp. $b'_{i'+1}$) that are strictly between 
these two positions. 

4. Let $i, i'$ be positions of $\mathbb{w}, \mathbb{w}'$ on which there are corresponding pebbles in $\mathcal{S}$ (meaning that 
one position corresponds to a move of Spoiler and the other one is Duplicator’s answer). 
Observe that since $i, i'$ have the same label, we have $a_i = a'_{i'}$, $b_i = b'_{i'}$ and $b_{i+1} = b'_{i'+1}$. 
In that case, the number of marked positions for $i$ is the same as the number of marked 
positions for $i'$, that is $m_i = m'_{i'}$. Furthermore, for all $j \leq m_i$: 

- $x_j(i)$ is the $a_i = a'_{i'}$ position of $u$ iff $x'_j(i')$ is the $a_i = a'_{i'}$ position of $u'$, and 
- $x_j(i)$ holds a pebble of $\mathcal{G}$ iff $x'_j(i')$ holds the corresponding pebble. 

Finally, given $j < m_i$, let $d$ and $d'$ be the number of positions that are strictly between 
$x_j(i)$ and $x_{j+1}(i)$ (resp. between $x'_j(i')$ and $x'_{j+1}(i')$). Note that by the condition above 
these positions are all labeled by $b_i = b'_{i'}$ or all labeled by $b_{i+1} = b'_{i'+1}$. We require that 
either $d = d'$, or $d \geq 2^\ell$ and $d' \geq 2^\ell$. 


Figure 2 Marked positions 

Figure 2 shows positions $i$ and $i + 1$ in $\mathbb{w}$ and $i'$ in $\mathbb{w}'$ corresponding to $i$ in the $\mathcal{S}$-game, 
as well as marked positions for $i$ and $i + 1$ in $u$ (resp. for $i'$ in $u'$). Greyed positions are 
the ones holding a pebble. Note that by Item 2, all marked positions in $u$ (resp. $u'$) except 
possibly some $a_i$ (resp. $a'_{i'}$) positions have to hold a pebble. Item 4 means that the pictures 
for $u$ and $u'$ look the same: for instance, since there are $m_i = 4$ marked positions for $i$ in $u$, 
there are also 4 marked positions for $i'$ in $u'$, where $i$ and $i'$ are corresponding moves in $\mathcal{S}$. 
Furthermore, all hold a pebble except the $a_i = a'_{i'}$ position in both $u$ and $u'$, and this 
position has the same index in both lists of marked positions for $i$ (resp. $i'$), namely index 2. 
Finally, distances between “corresponding” consecutive marked positions in $u$ and $u'$ are 
either equal, or both are at least $2^\ell$. In Figure 2, $x_2(i) - x_1(i) \neq x'_2(i') - x'_1(i')$, therefore 
these quantities have to be at least $2^\ell$. 

It is clear that $\mathcal{I}(k)$ holds before the initial round. Assume now that there are $(\ell + 1)$ 
rounds left to play and that $\mathcal{I}(\ell + 1)$ holds. We explain how Duplicator can play in order to 
enforce $\mathcal{I}(\ell)$. Assume that Spoiler puts a pebble at a position $x \in u$ in $\mathcal{G}$ (the case when 
Spoiler plays in $u'$ is symmetric). 

Duplicator first defines a position $i'$ in $\mathbb{w}'$ as follows, where $i$ denotes the position of $\mathbb{w}$ 
associated with $x$ in the case distinction below. If there is already a pebble on $i$ in 
$\mathcal{S}$, then we set $i'$ as the position holding the matching pebble in $\mathbb{w}'$. Otherwise, Duplicator 
simulates a move by Spoiler in $\mathcal{S}$ by putting a pebble on position $i$ and sets $i'$ as the answer 
she obtains from her strategy in $\mathcal{S}$. Note that by hypothesis all pebbles in $\mathbb{w}, \mathbb{w}'$ (including 
$i, i'$) satisfy the conditions of the $\Sigma_n(<)$-game. We now distinguish two cases depending on 
the position $x$. 

There exists $i \in \mathbb{w}$ such that $x_1(i) \leq x \leq x_{m_i}(i)$. We distinguish two subcases: 

■ If $x$ is already a marked position $x_j(i)$ for $i$, then Duplicator answers by putting a 
corresponding pebble on $x'_j(i')$. Note that this answer is correct by hypothesis on $i, i'$ for 
the $\Sigma_n(<)$-game and by hypothesis on the marked positions for $i, i'$ as stated in Item 2 
of $\mathcal{I}(\ell + 1)$. Since both positions were already marked for $i, i'$, it is then simple to verify 
that $\mathcal{I}(\ell)$ holds. 

■ Assume now that $x$ is not yet marked. Since $a_i$ positions are always marked, $x$ is a $b_i$ or 
a $b_{i+1}$ position. Assume that $x$ is a $b_i$ position (the other case is similar). Recall that 
$m_i = m'_{i'}$ by Item 4 in $\mathcal{I}(\ell + 1)$. Let $j$ be such that $x_j(i) < x < x_{j+1}(i)$. By Item 4 of 
$\mathcal{I}(\ell + 1)$ it is immediate that one can find an answer $x' \in u'$ such that $x'_j(i') < x' < x'_{j+1}(i')$ 
and Item 4 of $\mathcal{I}(\ell)$ remains satisfied with $x, x'$ as new marked positions for $i, i'$. Again 
this answer is correct by hypothesis on $i, i'$ for the $\Sigma_n(<)$-game and by hypothesis on the 
marked positions for $i, i'$ as stated in Item 2 of $\mathcal{I}(\ell + 1)$. It is then simple to verify that 
$\mathcal{I}(\ell)$ remains satisfied. 


There exists $i \in \mathbb{w}$ such that $x_{m_{i-1}}(i - 1) < x < x_1(i)$. From Item 3 in $\mathcal{I}(\ell + 1)$, we know 
that there are at least $2^{\ell+2}$ copies of $b_i$ between $x_{m_{i-1}}(i - 1)$ and $x_1(i)$. It follows that there 
are either at least $2^{\ell+1}$ copies of $b_i$ between $x_{m_{i-1}}(i - 1)$ and $x$ or at least $2^{\ell+1}$ copies of $b_i$ 
between $x$ and $x_1(i)$. Since both cases are symmetric, assume that we are in the first case: 
there are at least $2^{\ell+1}$ copies of $b_i$ between $x_{m_{i-1}}(i - 1)$ and $x$. 

Let $d$ be the number of copies of $b_i$ between $x$ and $x_1(i)$, i.e., $x = x_1(i) - (d + 1)$. If 
$d < 2^\ell$, we set $x' \in u'$ as the position $x' = x'_1(i') - (d + 1)$. Otherwise we set $x' \in u'$ as the 
position $x' = x'_1(i') - (2^\ell + 1)$. In both cases, $x'$ is Duplicator’s answer and we set $x, x'$ as 
new marked positions for $i, i'$. Note that this answer is correct by hypothesis on $i, i'$ for the 
$\Sigma_n(<)$-game. It is immediate that $\mathcal{I}(\ell)$ is satisfied by choice of $x'$. 

6 Tools for the Algebraic Approach: Varieties, Semidirect Product 


In this section, we set up the terminology needed for the algebraic version of our result. As 
explained in the introduction, we use varieties to capture our classes of separator languages. 
Informally, a variety is a class of finite algebras canonically associated to such a class of 
separators. We build our algebraic version of the transfer theorem from a weak fragment $\mathcal{F}$ 
to its enriched version $\mathcal{F}^+$ on three ingredients: 

I1. A solution to the separation problem for $\mathcal{F}$, as in the logical approach. 

I2. An algebraic description of the weak variant $\mathcal{F}$ as a variety V. 

I3. An algebraic description of the strong variant $\mathcal{F}^+$ as the variety V * D, built from V and 
from a fixed variety D with an operator called the semidirect product. 

These three points have already been solved for all fragments of Figure 1. The transfer result, 
Theorem 22 below, reduces separability by languages associated with V * D to separability 
by languages associated with V. Therefore, relying on the solution of Items I2 and I3, it 
provides a reduction from the separation problem by $\mathcal{F}^+$ languages to the separation problem 
by $\mathcal{F}$ languages. If in addition Item I1 is fulfilled, then separation by $\mathcal{F}^+$ languages is decidable. 

This section is devoted to making these notions precise. It is organized as follows: we first 
recall the notion of variety of ordered semigroups and monoids, and how varieties can be used 
to capture the classes of regular languages we are interested in. We then recall the construction 
of the semidirect product of two varieties in order to define the variety V * D. We finally 
present a bibliography giving, for each fragment $\mathcal{F}$ in Figure 1, references for solving the 
above questions I1-I3. The statement and the proof of the transfer result, Theorem 22, are 
postponed to Section 7. 

6.1 Varieties 

A variety of semigroups (resp. monoids) is a class of finite semigroups (resp. monoids) closed 
under three natural operations: finite direct product, subsemigroup (or submonoid), and 
homomorphic image. This makes it possible to define classes of regular languages based on 
the monoids that recognize these languages: a variety V defines the class of all languages 
recognized by semigroups (resp. monoids) in V. There is an issue however: all classes of 
languages defined in this way have to be closed under complement, since the set of languages 
recognized by any semigroup is closed under complement. This prevents us from capturing 
logical fragments that are not closed under complement, such as $\Sigma_2(<)$. This problem has 
been solved in [14] with the notions of ordered semigroups and monoids. Intuitively, such a 
semigroup is parametrized by a partial order and the set of languages it recognizes is then 
restricted with respect to this partial order. 

Let us recall this notion, which leads to the definition of variety of ordered semigroups or 
monoids. All classes considered in this paper may be defined in terms of such varieties. 

Ordered Semigroups. An ordered semigroup is a pair $(S, \leq)$ where $S$ is a semigroup and $\leq$ 
is a partial order on $S$, which is compatible with multiplication: $s \leq t$ and $s' \leq t'$ imply 
$ss' \leq tt'$. To simplify the notation, we will often omit the partial order $\leq$ when it is clear 
from the context and simply speak of an ordered semigroup $S$. Observe that any semigroup 
endowed with equality as the partial order is an ordered semigroup. In particular we view 
$A^+$ as an ordered semigroup with equality as the partial order. 

If $(S, \leq_S)$ and $(T, \leq_T)$ are ordered semigroups, an ordered semigroup morphism is a 
mapping $\alpha : S \to T$ which is a semigroup morphism and preserves the partial order, i.e., for 
all $s, s' \in S$, $s \leq_S s' \Rightarrow \alpha(s) \leq_T \alpha(s')$. Let $L \subseteq A^+$ and $(S, \leq)$ be an ordered semigroup. 
Then, $L$ is said to be recognized by $(S, \leq)$ if there exist an ordered semigroup morphism 
$\alpha : A^+ \to S$ and $F \subseteq S$, such that $L = \alpha^{-1}(F)$ and $F$ is upward closed, that is: 

$s \in F$ and $s \leq t \Rightarrow t \in F$. 

When $\leq$ is trivial, then any subset of $S$ is upward closed, and we recover exactly the classical 
notion of recognizability by semigroups presented just above. However, when $\leq$ is nontrivial, 
the set of recognized languages gets restricted because of the additional condition on the 
recognizing set $F$. In particular it may happen that a language is recognized by $(S, \leq)$, while 
its complement is not (its complement is recognized by $(S, \geq)$). 
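
The two conditions involved in recognition by an ordered semigroup, compatibility of the order with multiplication and upward closure of the accepting set, are easy to test on a finite example. The sketch below is a hypothetical illustration: the helper names and the two-element semigroup $(\{0, 1\}, \min)$ with its natural order are assumptions of this example, not objects from the paper.

```python
from itertools import product

def is_compatible(elements, mul, leq):
    """Check that s <= t and s2 <= t2 imply s*s2 <= t*t2."""
    return all(leq(mul(s, s2), mul(t, t2))
               for s, t, s2, t2 in product(elements, repeat=4)
               if leq(s, t) and leq(s2, t2))

def is_upward_closed(elements, leq, accepting):
    """Check that s in F and s <= t imply t in F."""
    return all(t in accepting
               for s in accepting for t in elements if leq(s, t))

# Assumed toy ordered semigroup: ({0, 1}, min) with the natural order.
S = [0, 1]
mul = min
leq = lambda s, t: s <= t
geq = lambda s, t: s >= t
```

With this order, the set $\{1\}$ is upward closed but its complement $\{0\}$ is not, whereas $\{0\}$ becomes upward closed for the reversed order, in line with the remark that the complement is recognized by $(S, \geq)$.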

Varieties of Ordered Semigroups. A variety of finite ordered semigroups is a class V of 
finite ordered semigroups that satisfies the following properties: 

1. V is closed under ordered subsemigroup: if $(S, \leq) \in$ V, then $(T, \leq) \in$ V when $T$ is a 
subsemigroup of $S$ and the order on $T$ is the restriction of the order on $S$. 

2. V is closed under ordered quotient: if $(S, \leq_S) \in$ V and $\alpha : (S, \leq_S) \to (T, \leq_T)$ is a surjective 
ordered semigroup morphism, then we have $(T, \leq_T) \in$ V. 

3. V is closed under Cartesian direct product: if $(S_1, \leq_1), (S_2, \leq_2) \in$ V, then we have 
$(S_1 \times S_2, \leq) \in$ V, where the semigroup $S_1 \times S_2$ is equipped with the componentwise 
multiplication and $(s_1, s_2) \leq (t_1, t_2)$ if $s_1 \leq_1 t_1$ and $s_2 \leq_2 t_2$. 

Note that for technical reasons, we have to consider both varieties of semigroups and 
monoids: non-enriched fragments correspond to varieties of monoids while enriched ones 
correspond to varieties of semigroups. For the sake of simplifying the presentation, we only 
give the definitions for semigroups. Ordered monoids and varieties of ordered monoids are 
defined in a similar way, as well as the non-ordered versions. 

Varieties and Classes of Languages. To any variety V of ordered semigroups (resp. of 
ordered monoids), we can associate the class of all languages that are recognized by an 
ordered semigroup (resp. ordered monoid) in V. As for logics and for the sake of simplifying 
the presentation, we may abuse notation and use V to denote both a variety and the class of 
languages it defines. 

It turns out that all classes from Figure 1 can be defined in such a way. Therefore, they 
all have an associated variety. This actually follows from a general result, Eilenberg’s 
theorem. One should however keep in mind that in this framework, there is: 

(a) Eilenberg’s theorem, a generic result establishing a correspondence between varieties and 
classes of languages (indexed by alphabets) enjoying certain closure properties: closure 
under Boolean operations, inverse morphisms and left and right residuals. It was first 
obtained by S. Eilenberg for classes closed under complement, and later generalized by 
J.E. Pin [14] when this assumption does not necessarily hold. 

(b) Specific instances of Eilenberg’s theorem, one for each particular class, relating such a 
class of languages with a corresponding variety of ordered semigroups or monoids. 

We will not state Eilenberg’s theorem precisely, as we do not need it. On the other hand, 
Item (b) is useful to provide an alternate version of our transfer result, Theorem 4, in the 
algebraic framework of Section 7. This alternate version, Theorem 22, is generic, in the sense 
that it transfers decidability of the separation problem for a variety V to the variety V * D, 
with no assumption on the variety V. However, in order to instantiate this generic theorem 
for our logical fragments, we need Item I3 above, i.e., to show that for each weak fragment $\mathcal{F}$, 
if the variety associated to $\mathcal{F}$ is V, then the variety associated to the enriched variant $\mathcal{F}^+$ 
is V * D. In other words, we shall rely on the aforementioned specific connections, Item (b) 
above, between a class of languages and a variety of ordered semigroups or monoids. Each 
fragment will be described in Section 6.3, and the fact that for all of them, if $\mathcal{F}$ corresponds 
to the variety V, then $\mathcal{F}^+$ corresponds to the variety V * D is stated in Theorem 18. 

6.2 The Semidirect Product 

Let M be an ordered monoid and let T be an ordered semigroup. A semidirect product of 
M and T is an operation which is parametrized by an action of T on M and outputs a 
new ordered semigroup, whose base set is $M \times T$. In particular, one can obtain different 
semidirect products out of the same M and T, depending on the chosen action. 

Let "+" and "$\cdot$" be the operations of M and T respectively. Note that we choose to denote 
the operation on M additively. This is for the sake of simplifying the presentation. However, 
this does not mean that we assume M to be commutative. An action "$*$" of T on M is a 
mapping $(t, s) \mapsto t * s$ from $T^1 \times M$ to M such that, for all $s, s' \in M$ and all $t, t' \in T$: 

■ $t * (t' * s) = (t \cdot t') * s$. 
■ $t * (s + s') = t * s + t * s'$. 
■ $1_T * s = s$. 
■ $t * 1_M = 1_M$. 
■ if $s \leq s'$, then $t * s \leq t * s'$. 
■ if $t \leq t'$, then $t * s \leq t' * s$. 

Given a fixed action "$*$" of T on M, the semidirect product $M * T$ of M and T with respect 
to this action is the set $M \times T$ equipped with the following operation: 

\[ (s, t) \cdot (s', t') = (s + t * s',\; t \cdot t') \]

and the componentwise order: 

\[ (s, t) \leq (s', t') \quad \text{if } s \leq s' \text{ and } t \leq t'. \]

One can verify that this does yield an ordered semigroup, see [16]. 
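
To make the definition concrete, the following Python sketch builds one semidirect product by brute force. The instance is a hypothetical toy example and is not taken from the paper: M is Z/3 written additively, T is {1, 2} under multiplication modulo 3 (so T already contains an identity), the action is t * s = ts mod 3, and both orders are assumed to be equality, so the two order axioms hold trivially. All function names are ours.

```python
from itertools import product

# Hypothetical toy instance (an assumption, not from the paper).
M = [0, 1, 2]   # Z/3, operation written additively
T = [1, 2]      # {1, 2} under multiplication mod 3; 1 is the identity

def add(s, s2):
    return (s + s2) % 3

def mul(t, t2):
    return (t * t2) % 3

def act(t, s):
    """Action of T on M: t * s."""
    return (t * s) % 3

def sd_product(x, y):
    """(s, t) . (s', t') = (s + t * s', t . t')."""
    (s, t), (s2, t2) = x, y
    return (add(s, act(t, s2)), mul(t, t2))

def action_axioms_hold():
    """Check the four equational axioms of an action (orders are equality)."""
    for t, t2 in product(T, repeat=2):
        for s, s2 in product(M, repeat=2):
            if act(t, act(t2, s)) != act(mul(t, t2), s):
                return False
            if act(t, add(s, s2)) != add(act(t, s), act(t, s2)):
                return False
    return all(act(1, s) == s for s in M) and all(act(t, 0) == 0 for t in T)

def is_associative(elements, op):
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(elements, repeat=3))

ELEMENTS = list(product(M, T))  # base set M x T
```

In this instance the product is not commutative even though both M and T are, which illustrates that the result depends on the chosen action. Note that this T is a group, so it only illustrates the product itself and not the variety D.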

Given a variety V of ordered monoids and a variety W of ordered semigroups, we denote 
by V * W the variety of ordered semigroups generated by all semidirect products of the form 
$M * T$, with $M \in V$ and $T \in W$, where $*$ ranges over all possible actions of T on M. 

The Variety D. We will only use the semidirect product with semigroups T from a specific 
variety, denoted by D. This is because such a semidirect product V * D of V with D is often 
related to the enrichment with the successor relation of the fragment captured by V. 

The variety D consists of all finite ordered semigroups S such that for all $s \in S$ and all 
$e \in E(S)$, we have $se = e$. From a language perspective, a language L is recognized by a 
semigroup in D iff there exists $k \in \mathbb{N}$ such that membership of a word w in L only depends 
on the suffix of length k of w. 
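
As an illustration, the identity $se = e$ can be checked by brute force on a finite multiplication. The sketch below is ours and ignores the order: `is_in_D` tests the identity on every element and every idempotent, and `suffix_semigroup` is a hypothetical helper building the nonempty words of length at most k, where the product keeps the last k letters of the concatenation.

```python
from itertools import product

def idempotents(elements, op):
    """Elements e with e.e = e."""
    return [e for e in elements if op(e, e) == e]

def is_in_D(elements, op):
    """Check the identity s.e = e for every element s and every idempotent e."""
    idem = idempotents(elements, op)
    return all(op(s, e) == e for s in elements for e in idem)

def suffix_semigroup(alphabet, k):
    """Nonempty words of length <= k; the product keeps the last k letters."""
    words = ["".join(p)
             for n in range(1, k + 1)
             for p in product(alphabet, repeat=n)]
    return words, (lambda u, v: (u + v)[-k:])
```

For instance, the suffix semigroup over {a, b} with k = 2 satisfies the identity, since its idempotents are exactly the words of length 2, whereas the group Z/2 under addition does not.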

The reason why we introduce such semidirect products is the following theorem, which 
gathers several nontrivial results from the literature listed in Section 6.3, and which answers 
our requirement 13 towards our transfer theorem. 

► Theorem 18. Let V be a variety corresponding to a fragment $F$ from the ones presented 
in Figure 1. Then, the variety corresponding to the fragment $F^+$ is V * D. 

6.3 Algebraic Characterizations of Logically Defined Fragments 

In this section, we consider Items 12 and 13, which were to be solved in order to apply our 
generic theorem. All logical fragments of Figure 1 correspond to varieties that have been 
fully identified. We present, for each such fragment, bibliographic references relating its 
weak and strong variants to varieties. In particular, we will see that Theorem 18 holds: for 
each fragment whose non-enriched variant corresponds to a variety V of ordered monoids, its 
enriched version corresponds to the variety of ordered semigroups V * D built from V. 
6.3.1 First-order with Equality 

The logic FO(=) is the restriction of FO(<) in which the linear order cannot be used, and only 
equality between two positions can be tested. It is folklore that FO(=)-definable languages 
are exactly those that can be defined using a monoid in the variety of monoids ACom of 
aperiodic and commutative monoids. 

The enriched fragment is FO(=, +1), as min and max can be eliminated in the formulas. 
It defines locally threshold testable languages [32]. In [30], it was proved that FO(=, +1)-definable 
languages are exactly those that can be defined in ACom * D. In particular this 
was used to solve the membership problem for FO(=, +1). 

That separation is decidable for FO(=) is simple (essentially, the problem can be reduced 
to the decision of Presburger logic, see [19]). Hence Theorem 4 and Theorem 22 yield two 
different proofs of the following corollary. 

► Corollary 19. Let L, L' be regular languages. It is decidable to test whether L is 
FO(=, +1)-separable from L'. 

As we already explained, while the proof of Corollary 19 is new, the result itself is not. 
A specific proof was presented in [19] and the result can also be obtained through indirect 
means by combining results from [2, 26]. 

6.3.2 Quantifier Alternation Hierarchy 

One can classify first-order formulas by counting the number of alternations between $\exists$ and 
$\forall$ quantifiers in the prenex normal form of the formula. For $i \in \mathbb{N}$, a formula is said to be 
$\Sigma_i(<)$ (resp. $\Pi_i(<)$) if its prenex normal form has $(i - 1)$ quantifier alternations (that is, $i$ 
blocks of quantifiers) and starts with an $\exists$ (resp. a $\forall$) quantifier. For example, a formula 
whose prenex normal form is 

\[ \exists x_1\, \exists x_2\, \forall x_3\, \exists x_4\; \varphi(x_1, x_2, x_3, x_4) \quad \text{(with $\varphi$ quantifier-free)} \]

is $\Sigma_3(<)$. Observe that a $\Pi_i(<)$ formula is by definition the negation of a $\Sigma_i(<)$ formula. 
Finally, a $\mathcal{B}\Sigma_i(<)$ formula is a boolean combination of $\Sigma_i(<)$ formulas. 
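
The block-counting rule can be sketched in a few lines of Python. This is only an illustration under our own encoding, which is an assumption and not part of the paper: the quantifier prefix of a prenex formula is given as a string over 'E' (existential) and 'A' (universal), and the function returns the level read directly off that prefix.

```python
def classify_prefix(prefix):
    """Return ('Sigma' or 'Pi', i) for a nonempty prenex quantifier prefix.

    `prefix` is a string over {'E', 'A'}; i is the number of maximal blocks
    of identical quantifiers, i.e. one more than the number of alternations.
    """
    if not prefix or set(prefix) - {"E", "A"}:
        raise ValueError("prefix must be a nonempty string over 'E' and 'A'")
    # Each change of quantifier between adjacent positions starts a new block.
    blocks = 1 + sum(1 for q, r in zip(prefix, prefix[1:]) if q != r)
    return ("Sigma" if prefix[0] == "E" else "Pi", blocks)
```

On the prefix of the example above, encoded as "EEAE", the function reports three blocks starting with an existential quantifier, in line with the formula being $\Sigma_3(<)$.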

Both this hierarchy and the enriched variant are known to be strict [4, 33]. Furthermore, 
they correspond to well-known hierarchies of classes of languages: the non-enriched hierarchy 
corresponds to the Straubing-Thérien hierarchy [27, 29], while the enriched hierarchy 
corresponds to the dot-depth hierarchy [5]. Note that for all fragments above $\Sigma_2(<)$, the 
predicates min and max can be eliminated from the logic. Hence, we denote the enriched 
fragments by $\Sigma_1(<, +1, \min, \max)$, $\mathcal{B}\Sigma_1(<, +1, \min, \max)$, $\Sigma_2(<, +1)$, ... 

Solving the membership problem for all levels in both hierarchies has been an open 
problem for a long time. As of today, only the lower levels are known to be decidable. 
Historically, $\mathcal{B}\Sigma_1(<)$ and $\mathcal{B}\Sigma_1(<, +1, \min, \max)$ were investigated first. It is known 
from [25] that $\mathcal{B}\Sigma_1(<)$ has decidable membership and corresponds to the variety of monoids J. 
For $\mathcal{B}\Sigma_1(<, +1, \min, \max)$, decidability was proved in [10], and the correspondence 
with the variety of semigroups J * D in [28]. 

The fragments $\Sigma_1(<)$ and $\Sigma_2(<)$ were shown to have decidable membership in [15]. 
Moreover, the authors also prove that each of these two fragments corresponds to a variety of 
ordered monoids and that $\Sigma_1(<, +1, \min, \max)$ and $\Sigma_2(<, +1)$ correspond to the varieties 
of semigroups obtained by taking the semidirect product with D. From this correspondence, 
they obtain decidability of membership for $\Sigma_1(<, +1, \min, \max)$. This is more involved for 
$\Sigma_2(<, +1)$ and was proved later in [8]. 




Recently, membership has been shown to be decidable for both $\mathcal{B}\Sigma_2(<)$ and $\Sigma_3(<)$ [2: ]. 
These results can be transferred to $\mathcal{B}\Sigma_2(<, +1)$ and $\Sigma_3(<, +1)$ using a result by Straubing [28], 
or Theorem 6 in this paper. For all levels above, the membership problem is open. 

Separation is known to be decidable for $\Sigma_1(<)$ [6], $\mathcal{B}\Sigma_1(<)$ [20, 6] and $\Sigma_2(<)$ [21]. Hence 
Theorem 4 and Theorem 22 yield two different proofs of the following corollary. 

► Corollary 20. Let L, L' be regular languages. Then the following problems are decidable: 

■ whether L is $\Sigma_1(<, +1, \min, \max)$-separable from L'. 
■ whether L is $\mathcal{B}\Sigma_1(<, +1, \min, \max)$-separable from L'. 
■ whether L is $\Sigma_2(<, +1)$-separable from L'. 

As we explained in Section 2, the result for $\mathcal{B}\Sigma_1(<, +1, \min, \max)$ is not new, as it can also be 
obtained through indirect means by combining results from [2, 26]. On the other hand, the 
results are new for both $\Sigma_1(<, +1, \min, \max)$ and $\Sigma_2(<, +1)$. 

6.3.3 Two-Variable First-Order Logic 

The logic $\mathrm{FO}^2(<)$ is the restriction of FO(<) using only two (reusable) variables. The 
corresponding enriched fragment is $\mathrm{FO}^2(<, +1)$ (min and max can be eliminated from the 
logic). 

In [31], it was proved that $\mathrm{FO}^2(<)$ and $\mathrm{FO}^2(<, +1)$ correspond respectively to the 
varieties DA and DA * D. This immediately yields decidability of membership for $\mathrm{FO}^2(<)$. 
For $\mathrm{FO}^2(<, +1)$, this additionally requires a deep algebraic result by Almeida [1] (a simpler 
self-contained proof also exists [18]). The separation problem has been proved to be decidable 
for $\mathrm{FO}^2(<)$ in [20]. Hence Theorem 4 and Theorem 22 yield two different proofs of the 
following corollary. 

► Corollary 21. Let L, L' be regular languages. It is decidable to test whether L is 
$\mathrm{FO}^2(<, +1)$-separable from L'. 

As we explained in Section 2, while the proof is new, the result itself is not. It can also 
be obtained through indirect means, again by combining results from [2, 26]. 

7 Algebraic Approach 


We are now ready to prove Theorem 22. Recall that we have a non-trivial variety V of ordered 
monoids, two languages L and L' recognized by a morphism $\alpha : A^+ \to S$, and $\mathbb{L}, \mathbb{L}' \subseteq \mathbb{A}^+$ 
the associated languages of well-formed words. 

We prove that L is (V * D)-separable from L' if and only if $\mathbb{L}$ is V-separable from $\mathbb{L}'$. We 
prove each direction in its own subsection. 

We now present an algebraic version of Theorem 4: the operator $V \mapsto V * D$ preserves 
decidability of separation. 

We would like to emphasize again that the ideas behind this theorem are essentially the 
same as for Theorem 4. In particular, proofs only rely on elementary notions, thus bypassing 
complex constructions usually used to prove this kind of result, even if the statement itself 
requires some additional algebraic vocabulary. 

The section is organized in three parts. 
■ We first briefly recall how classes of languages corresponding to our logical fragments are 
given an algebraic definition: for each fragment, an associated class of finite semigroups 
(or monoids) V, a variety, has already been characterized, such that the class of languages 
definable in the fragment is exactly the class of languages that are recognized by a 
semigroup (or monoid) of V. 

■ In the second part, we define what “adding the successor relation” means in this context. 
Given a variety V, this generally corresponds to considering a new variety built on top 
of V via an operation called the semidirect product. This new variety is denoted V * D. 

■ Finally, in the last part, we state our main theorem: for any variety V, separability for 
the variety V * D reduces to separability for the variety V. 

7.1 Main Theorem 

We now have the machinery needed to state our main theorem. For any variety of ordered 
monoids V, we reduce (V * D)-separability to V-separability. 

► Theorem 22. Let V be a non-trivial variety of ordered monoids. Let L and L' be two 
languages both recognized by the same morphism $\alpha : A^+ \to S$ into a finite semigroup S. Set 
$\mathbb{L}, \mathbb{L}' \subseteq \mathbb{A}^+$ as the languages of well-formed words associated to L, L', respectively. Then, L 
is (V * D)-separable from L' if and only if $\mathbb{L}$ is V-separable from $\mathbb{L}'$. 

In view of Theorem 18, Theorem 22 applies to all fragments we introduced. This means 
that Theorem 4 can be given an alternate indirect proof within this algebraic framework by 
combining Theorem 22 and Theorem 18. Hence, this also yields another proof of Corollary 5. 

The proof of Theorem 22 is presented in the rest of this section. As was the case for 
Theorem 4, the proof is both elementary and constructive: if there exists a separator for $\mathbb{L}$ 
and $\mathbb{L}'$ in V, we use it to construct a separator for L and L' in V * D. 

The rest of this section is divided into three parts. In the first one, we recall the formal 
definition of the semidirect product operation. In the next two, we prove both directions 
of Theorem 22. 

7.2 From (V * D)-separability to V-separability 

We prove that if L is (V * D)-separable from L', then $\mathbb{L}$ is V-separable from $\mathbb{L}'$. Note that we 
reuse the construction which associates a canonical word $\lceil w \rceil_i \in A^+$ to every word $w \in \mathbb{A}^+$ 
and natural $i > 1$ (see Section 5.2 for details). 

Assume that L is (V * D)-separable from L'. This means that there exists an element of 
(V * D) separating L and L'. By [16, Prop. 3.5], such an ordered semigroup is an ordered 
quotient of an ordered subsemigroup of a semidirect product $M * T$, with $M \in V$ and $T \in D$. 
Therefore, $M * T$ itself separates L and L'. Hence, there is some upward closed $F \subseteq M * T$ 
and a morphism $\delta : A^+ \to M * T$ such that $\delta^{-1}(F)$ separates L from L'. 

We construct a separator in V for $\mathbb{L}$ and $\mathbb{L}'$. Set $T = \{t_1, \ldots, t_n\}$ and observe that since 
V is non-trivial, it contains an ordered monoid N containing at least n distinct elements. 
We choose such elements $t_1', \ldots, t_n'$ of N. The choice is essentially arbitrary, but we ask 
$t_1', \ldots, t_n'$ to be pairwise incomparable with respect to the partial order $\leq$. We prove that $\mathbb{L}$ 
can be separated from $\mathbb{L}'$ using the ordered monoid $\mathbf{M} = M \times N \in V$ (recall that a variety is 
closed under Cartesian product). For an element $t = t_i$ of T, we denote by $t'$ the element $t_i'$ 
of N. 
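We define a morphism $\gamma : \mathbb{A}^+ \to \mathbf{M}$ as follows. Let $\omega$ be the idempotent power $\omega(M * T)$ 
of $M * T$. Set $a = (e, s, f) \in \mathbb{A}_\alpha$, so that $\lceil a \rceil_\omega = w_e^\omega w_s w_f^\omega$. Let $\delta(w_e^\omega) = (m_e, t_e) \in M * T$ 
and $\delta(\lceil a \rceil_\omega) = (m, t)$. We define $\gamma(a) \in M \times N$ as follows: 

\[ \gamma(a) = \begin{cases} (t_e * m,\; t') & \text{when } f = 1_S, \\ (t_e * m,\; 1_N) & \text{otherwise.} \end{cases} \]

This defines a morphism $\gamma : \mathbb{A}_\alpha^+ \to M \times N \in V$. It remains to prove that $\gamma$ recognizes a 
separator of $\mathbb{L}$ and $\mathbb{L}'$. This is a consequence of the next lemma. 

► Lemma 23. Let $w \in \mathbb{A}^+$ be well-formed, and set $(m, t_i) = \delta(\lceil w \rceil_\omega)$. Then $\gamma(w) = (m, t_i')$. 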

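Before proving the lemma, we use it to conclude the proof. Define $\mathbf{F} \subseteq \mathbf{M}$ by 
$\mathbf{F} = \{(m, t_i') \mid (m, t_i) \in F\}$. One can verify that $\mathbf{F}$ is upward closed. We claim that 
$\gamma^{-1}(\mathbf{F})$ separates $\mathbb{L}$ from $\mathbb{L}'$. 

Assume first that $w \in \mathbb{L}$. By Fact 12, $\lceil w \rceil_\omega \in L$, hence $\delta(\lceil w \rceil_\omega) \in F$. It then follows 
from Lemma 23 that $\gamma(w) \in \mathbf{F}$. Conversely, if $w \in \mathbb{L}'$, we have $\delta(\lceil w \rceil_\omega) \notin F$. It then follows 
from Lemma 23 that $\gamma(w) \notin \mathbf{F}$, which terminates the proof. We now prove Lemma 23. 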

Proof of Lemma 23. We first show that the first component in M of $\delta(\lceil w \rceil_\omega)$ and of 
$\gamma(w)$ are equal. The proof consists in a straightforward but tedious computation. Set 
$w = a_1 \cdots a_p \in \mathbb{A}^+$ that is well-formed. Set $a_i = (e_{i-1}, s_i, e_i)$ and recall that, in view of the 
definition of $\lceil w \rceil_i$ given in Section 5.2, we have chosen words $w_{e_i}$ and $w_{s_i}$ such that: 

\[ \lceil a_i \rceil_\omega = w_{e_{i-1}}^\omega \cdot w_{s_i} \cdot w_{e_i}^\omega \]

For each idempotent $e = e_i$, set $\delta(w_e^\omega) = (m_e, t_e) \in M * T$ and for each element $s = s_i$, let 
$\delta(w_s) = (m_s, t_s) \in M * T$. Note that by definition of $\omega$, the element $(m_e, t_e) = \delta(w_e^\omega) = \delta(w_e)^\omega$ 
is idempotent, so $(m_e, t_e) = (m_e + t_e * m_e,\; t_e t_e)$. In particular, $t_e$ is idempotent in T. Further, 
we have for all i: 

\[ m_{e_i} + t_{e_i} * m_{e_i} = m_{e_i}. \tag{3} \]

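For each $a_i = (e_{i-1}, s_i, e_i)$, we then have 

\[
\begin{aligned}
\delta(\lceil a_i \rceil_\omega) &= \delta(w_{e_{i-1}}^\omega)\, \delta(w_{s_i})\, \delta(w_{e_i}^\omega) \\
&= (m_{e_{i-1}}, t_{e_{i-1}})(m_{s_i}, t_{s_i})(m_{e_i}, t_{e_i}) \\
&= \bigl(m_{e_{i-1}} + t_{e_{i-1}} * m_{s_i} + t_{e_{i-1}} t_{s_i} * m_{e_i},\; t_{e_i}\bigr),
\end{aligned}
\tag{4}
\]

where, for computing the 2nd component, we used the fact that $t_{e_i}$ is idempotent in $T \in D$. 
Similarly, by definition we have $\lceil w \rceil_\omega = (w_{e_0})^\omega w_{s_1} (w_{e_1})^\omega \cdots (w_{e_{p-1}})^\omega w_{s_p} (w_{e_p})^\omega$, and 

\[
\begin{aligned}
\delta(\lceil w \rceil_\omega) &= \delta(w_{e_0})^\omega\, \delta(w_{s_1})\, \delta(w_{e_1})^\omega \cdots \delta(w_{e_{p-1}})^\omega\, \delta(w_{s_p})\, \delta(w_{e_p})^\omega \\
&= (m_{e_0}, t_{e_0})(m_{s_1}, t_{s_1})(m_{e_1}, t_{e_1}) \cdots (m_{e_{p-1}}, t_{e_{p-1}})(m_{s_p}, t_{s_p})(m_{e_p}, t_{e_p}) \\
&= \bigl(m_{e_0} + (t_{e_0} * m_{s_1} + t_{e_0} t_{s_1} * m_{e_1}) + \cdots + (t_{e_{p-1}} * m_{s_p} + t_{e_{p-1}} t_{s_p} * m_{e_p}),\; t_{e_p}\bigr).
\end{aligned}
\tag{5}
\]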

Again, for the last equality, we used the definition of the semidirect product and the fact 
that each $t_{e_i}$ is an idempotent in T, which implies, since $T \in D$, that $t \cdot t_{e_i} = t_{e_i}$ for all $t \in T$. 

Using (3) for each i, one can replace $m_{e_i}$ in (5) by $m_{e_i} + t_{e_i} * m_{e_i}$. Taking into account 
that $t_{e_i}$ is idempotent in T, this yields for this first component of $\delta(\lceil w \rceil_\omega)$ the value 

\[ m_{e_0} + t_{e_0} * m_{e_0} + (t_{e_0} * m_{s_1} + t_{e_0} t_{s_1} * m_{e_1} + t_{e_1} * m_{e_1}) + \cdots + (t_{e_{p-1}} * m_{s_p} + t_{e_{p-1}} t_{s_p} * m_{e_p} + t_{e_p} * m_{e_p}) \]
Observe that since w is well-formed, $e_0 = 1_S$, hence $m_{e_0} = 1_M$, which is the neutral 
element for the "+" operation on M. In the same way, $e_p = 1_S$, hence $m_{e_p} = 1_M$, and 
therefore, using the axiom $t * 1_M = 1_M$ of an action, we deduce that $t_{e_p} * m_{e_p} = 1_M$. Hence, these two 
elements can be removed from the expression of the first component of $\delta(\lceil w \rceil_\omega)$. Therefore, 
this first component can be rewritten, using associativity, as: 

\[ (t_{e_0} * m_{e_0} + t_{e_0} * m_{s_1} + t_{e_0} t_{s_1} * m_{e_1}) + \cdots + (t_{e_{p-1}} * m_{e_{p-1}} + t_{e_{p-1}} * m_{s_p} + t_{e_{p-1}} t_{s_p} * m_{e_p}). \tag{6} \]

On the other hand, in view of (4) and by definition of $\gamma$, the first component of $\gamma(a_i) \in 
M \times N$ is 

\[ t_{e_{i-1}} * \bigl(m_{e_{i-1}} + t_{e_{i-1}} * m_{s_i} + t_{e_{i-1}} t_{s_i} * m_{e_i}\bigr) = \bigl(t_{e_{i-1}} * m_{e_{i-1}} + t_{e_{i-1}} * m_{s_i} + t_{e_{i-1}} t_{s_i} * m_{e_i}\bigr). \tag{7} \]

Therefore, one can compute the first component of $\gamma(w) = \gamma(a_1 \cdots a_p) = \gamma(a_1) \cdots \gamma(a_p)$ by 
summing the values (7) for $i = 1, \ldots, p$ (recall that the operation on M is written additively), 
which gives the value computed in (6). Hence we have shown that the first component in M 
of $\delta(\lceil w \rceil_\omega)$ and of $\gamma(w)$ are equal. 

It remains to check that when the second component of $\delta(\lceil w \rceil_\omega)$ is equal to some $t \in T$, 
then the second component of $\gamma(w)$ is the corresponding element $t' \in N$. This is simpler: by 
definition of a well-formed word, we have $e_i \neq 1_S$ for $i < p$, and $e_p = 1_S$. By definition of $\gamma$, 
it follows that the second component of $\gamma(w)$ is the second component of $\gamma(a_p)$, namely $t_p'$, 
where $t_p$ is the second component of $\delta(\lceil a_p \rceil_\omega)$. Now, since $T \in D$, the second component 
of $\delta(\lceil w \rceil_\omega)$ is also $t_p$, which concludes the proof. ◄ 


7.3 From V-separability to (V * D)-separability 

We prove that if $\mathbb{L}$ is V-separable from $\mathbb{L}'$, then L is (V * D)-separable from L'. Note that we 
reuse the construction which associates to every word $w \in A^+$ a canonical word $\lfloor w \rfloor \in \mathbb{A}^+$ 
(see Section 5.1 for details). 

Assume that $\mathbb{L}$ is V-separable from $\mathbb{L}'$. This means that we have a morphism $\gamma : \mathbb{A}^* \to M$ 
with M an ordered monoid in V and $F \subseteq M$ upward-closed such that $\gamma^{-1}(F)$ separates $\mathbb{L}$ 
from $\mathbb{L}'$. We need to construct a separator in V * D for L and L'. The main idea is to define 
a morphism which, given $w \in A^+$, computes $\gamma(\lfloor w \rfloor)$. This is slightly technical, however, as 
the morphism needs some machinery to make this computation. 

We begin with some notation. To every word $w \in A^+$, we associate an element 
$lab(w) \in M$. Let x be the last position in w and consider the construction of $\lfloor w \rfloor$. If x is 
distinguished, we set $lab(w) = \gamma(a)$ with a the label of $\lfloor x \rfloor$ in $\lfloor w \rfloor$. Otherwise, we simply set 
$lab(w) = 1_M$. We can now start the construction of our separator. We have to define the 
following objects: 

■ An ordered semigroup $T \in D$. 

■ An ordered monoid $\mathbf{M} \in V$. 

■ An action of T on $\mathbf{M}$ yielding a semidirect product $\mathbf{M} * T$. 

■ A morphism $\delta : A^+ \to \mathbf{M} * T$ which recognizes the desired separator. 


Definition of T. We set T as the set $\{w \in A^+ \mid |w| \leq 2|S|\}$ equipped with the following 
operation. If $w, w' \in T$, we set $w \cdot w'$ as the suffix of length $2|S|$ of the word $ww'$ when 
$ww'$ has length $> 2|S|$ and as $ww'$ otherwise. One can verify that this operation is indeed 
associative and that $T \in D$. We use equality as the partial order on T. 

Observe that we have a natural morphism $p : A^+ \to T$ such that $p(w)$ is w if $|w| \leq 2|S|$, 
and $p(w)$ is the suffix of length $2|S|$ of w otherwise. Observe that by Lemma 9, we have the 
following fact. 

► Fact 24. For every $w \in A^+$, $lab(w) = lab(p(w))$. 


Definition of $\mathbf{M}$. We set $\mathbf{M} \in V$ as the Cartesian product $M^{T^1}$ (recall that as a variety of 
ordered monoids, V is closed under Cartesian product). 

► Remark. Since we intend to take a semidirect product of $\mathbf{M}$ and T, we will denote the 
semigroup operations of both M and $\mathbf{M}$ additively in order to clarify the presentation. 


Definition of $\mathbf{M} * T$. If $w \in T$ and $f \in \mathbf{M}$ (i.e., f is a mapping $f : T^1 \to M$), we set $w \cdot f$ as 
the mapping $g : T^1 \to M$ such that $g(u) = f(u \cdot w)$. One can verify that "$\cdot$" is an action of T 
on $\mathbf{M}$. In the remainder of the proof, we denote by $\mathbf{M} * T$ the semidirect product of $\mathbf{M}$ and 
T with respect to this action. 
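
The verification that this is an action can be carried out mechanically on a small instance. The Python sketch below is only an illustration under assumptions of ours, none of which come from the paper: the alphabet is {a, b}, the bound 2|S| is replaced by a hypothetical constant K = 2, the monoid M is Z/2 written additively, and mappings from $T^1$ to M are stored as tuples. Since the order on T is equality, only the equational axioms are checked.

```python
from itertools import product

K = 2  # hypothetical stand-in for the bound 2|S|
T = ["".join(p) for n in range(1, K + 1) for p in product("ab", repeat=n)]
T1 = [""] + T  # the empty string plays the role of the added identity 1_T

def t_mul(u, v):
    """Product of T: keep the suffix of length K of the concatenation."""
    return (u + v)[-K:]

INDEX = {u: i for i, u in enumerate(T1)}
FUNCS = list(product([0, 1], repeat=len(T1)))  # all maps T^1 -> Z/2, as tuples
ZERO = (0,) * len(T1)                          # neutral element of the product

def f_add(f, g):
    """Pointwise addition of two maps T^1 -> Z/2."""
    return tuple((x + y) % 2 for x, y in zip(f, g))

def act(w, f):
    """(w . f)(u) = f(u . w)."""
    return tuple(f[INDEX[t_mul(u, w)]] for u in T1)

def is_action():
    """Brute-force check of the equational axioms of an action."""
    for f in FUNCS:
        if act("", f) != f:                       # 1_T . f = f
            return False
        for w, w2 in product(T, repeat=2):        # w . (w' . f) = (w w') . f
            if act(w, act(w2, f)) != act(t_mul(w, w2), f):
                return False
    for w in T:
        if act(w, ZERO) != ZERO:                  # w . 0 = 0
            return False
        for f, g in product(FUNCS, repeat=2):     # w . (f + g) = w . f + w . g
            if act(w, f_add(f, g)) != f_add(act(w, f), act(w, g)):
                return False
    return True
```

The composition axiom holds here because the truncated concatenation on T is associative, which is the same reason as in the general case.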

Definition of $\delta$. Set $f_{id} : T^1 \to M$ defined as follows. We set $f_{id}(1_T) = 1_M$ and $f_{id}(w) = 
lab(w)$ when $w \in T$. We can now define $\delta : A^+ \to \mathbf{M} * T$. Let $a \in A$, we set $\delta(a)$ as the 
pair $(f_a, a)$ where $f_a = a \cdot f_{id}$, i.e., the mapping $f_a : w \mapsto f_{id}(wa)$. It now remains to prove 
that $\delta$ does recognize a separator of L from L'. This is a consequence of the following lemma. 

► Lemma 25. Let $w \in A^+$, $(f, u) = \delta(w)$ and $end(u)$ as the label of the last position in $\lfloor u \rfloor$. 
Then, 

\[ \gamma(\lfloor w \rfloor) = f(1_T) \cdot \gamma(end(u)). \]

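We first use the lemma to conclude the proof. Set $\mathbf{F} \subseteq \mathbf{M} * T$ as the set 
$\mathbf{F} = \{(f, u) \mid f(1_T) \cdot \gamma(end(u)) \in F\}$. 

One can verify that $\mathbf{F}$ is upward closed (this is essentially because F is upward-closed). It is 
immediate from Lemma 25 that $\delta(w) \in \mathbf{F}$ iff $\gamma(\lfloor w \rfloor) \in F$. We claim that $\delta^{-1}(\mathbf{F})$ separates L 
from L'. 

Assume first that $w \in L$, we need to prove that $w \in \delta^{-1}(\mathbf{F})$. By Fact 8, we have $\lfloor w \rfloor \in \mathbb{L}$, 
hence $\gamma(\lfloor w \rfloor) \in F$ and $\delta(w) \in \mathbf{F}$. Similarly, if $w \in L'$, $\lfloor w \rfloor \in \mathbb{L}'$, hence $\gamma(\lfloor w \rfloor) \notin F$ and 
$\delta(w) \notin \mathbf{F}$, which terminates the proof. It finally remains to prove Lemma 25. 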

Proof of Lemma 25. Set $w = a_1 \cdots a_n$ and $\lfloor w \rfloor = \mathbf{a}_1 \cdots \mathbf{a}_m$. By definition, we have: 

\[ f = p(a_1) \cdot f_{id} + p(a_1 a_2) \cdot f_{id} + \cdots + p(a_1 \cdots a_n) \cdot f_{id} \]

By definition of $f_{id}$ and by Fact 24 this means that: 

\[ f(1_T) = lab(a_1) + lab(a_1 a_2) + \cdots + lab(a_1 \cdots a_n) \]

It is then immediate from the definition of $\lfloor w \rfloor$ that $f(1_T) = \gamma(\mathbf{a}_1 \cdots \mathbf{a}_{m-1})$. Hence $\gamma(\lfloor w \rfloor) = 
f(1_T) \cdot \gamma(end(w))$. This finishes the proof since u is the suffix of length $2|S|$ of w, and 
therefore $end(u) = end(w)$ by Lemma 9. ◄ 
Conclusion 


We proved that separation is decidable over finite words for the following logical fragments: 
FO(=, +1), $\Sigma_1(<, +1, \min, \max)$, $\mathcal{B}\Sigma_1(<, +1, \min, \max)$, $\Sigma_2(<, +1)$ and $\mathrm{FO}^2(<, +1)$. To 
achieve this, we presented a simple reduction to the same problem for the weaker fragments 
FO(=), $\Sigma_1(<)$, $\mathcal{B}\Sigma_1(<)$, $\Sigma_2(<)$ and $\mathrm{FO}^2(<)$. 

The reduction itself is entirely generic to all fragments and its proof is elementary, and 
also mostly generic. In particular, the technique can be used to prove that the reduction 
works for other natural fragments of first-order logic. An interesting example to which 
these results apply is the quantifier alternation hierarchy within $\mathrm{FO}^2(<)$ (known as the 
Trotter-Weil hierarchy, and which is decidable [34]). However, the separation problem for 
classes in this hierarchy has yet to be investigated. We also obtained direct proofs that 
membership is decidable for $\mathcal{B}\Sigma_2(<, +1)$ and $\Sigma_3(<, +1)$. 

Finally, we presented an algebraic formulation of this reduction, which recovers a previously 
known result by Steinberg [26], while having a much simpler proof. One can expect to extend 
these results to other fragments, such as enrichment with modulo predicates. Another 
advantage of this technique is that it can be extended in a straightforward way to the same 
logical fragments over words of infinite length. This yields identical transfer results. We 
leave the presentation of these results for further work. 